INTRODUCTION

Field of Invention

This invention relates to recombinant proteins, genes,
and gene probes, and more specifically to such proteins and
probes derived from an enterically transmitted non-A/non-B
hepatitis viral agent, to diagnostic methods and vaccine
applications which employ the proteins and probes, and to gene
segments that encode specific epitopes (and proteins
artificially produced to contain those epitopes) that are
particularly useful in diagnosis and prophylaxis.
Background

Enterically transmitted non-A/non-B hepatitis viral
agent (ET-NANB; also referred to herein as HEV) is the reported
cause of hepatitis in several epidemics and sporadic cases in
Asia, Africa, Europe, Mexico, and the Indian subcontinent.
Infection is usually by water contaminated with feces, although
the virus may also spread by close physical contact.
The virus does not seem to cause chronic infection.
The viral etiology in ET-NANB has been demonstrated by
infection of volunteers with pooled fecal isolates;
immune electron microscopy (IEM) studies have shown
virus particles with 27-34 nm diameters in stools
from infected individuals. The virus particles reacted
with antibodies in serum from infected individuals
from geographically distinct regions, suggesting that
a single viral agent or class is responsible for the
majority of ET-NANB hepatitis seen worldwide. No
antibody reaction was seen in serum from individuals
infected with parenterally transmitted NANB virus
(also known as hepatitis C virus or HCV), indicating
a different specificity between the two NANB types.

In addition to serological differences, the
two types of NANB infection show distinct clinical
differences. ET-NANB is characteristically an acute
infection, often associated with fever and arthralgia,
and with portal inflammation and associated bile
stasis in liver biopsy specimens (Arankalle).
Symptoms are usually resolved within six weeks.
Parenterally transmitted NANB, by contrast, produces a
chronic infection in about 50% of the cases. Fever and
arthralgia are rarely seen, and inflammation has a
predominantly parenchymal distribution (Khuroo, 1980).
The course of ET-NANBH is generally uneventful in
healthy individuals, and the vast majority of those
infected recover without the chronic sequelae seen
with HCV. One peculiar epidemiologic feature of this
disease, however, is the markedly high mortality
observed in pregnant women; this is reported in
numerous studies to be on the order of 10-20%. This
finding has been seen in a number of epidemiologic
studies but at present remains unexplained. Whether
this reflects viral pathogenicity, the lethal
consequence of the interaction of virus and
immune-suppressed (pregnant) host, or a reflection of the
debilitated prenatal health of a susceptible
malnourished population remains to be clarified.

The two viral agents can also be distinguished
on the basis of primate host susceptibility.
ET-NANB, but not the parenterally transmitted agent,
can be transmitted to cynomolgus monkeys. The
parenterally transmitted agent is more readily
transmitted to chimpanzees than is ET-NANB (Bradley,
1987).

There have been major efforts worldwide to
identify and clone viral genomic sequences associated
with ET-NANB hepatitis. One goal of this effort,
requiring virus-specific genomic sequences, is to
identify and characterize the nature of the virus and
its protein products. Another goal is to produce
recombinant viral proteins which can be used in
antibody-based diagnostic procedures and for a
vaccine. Despite these efforts, viral sequences
associated with ET-NANB hepatitis have not been
successfully identified or cloned heretofore, nor have
any virus-specific proteins been identified or
produced.

Relevant Literature

Arankalle, V.A., et al., The Lancet, 550 (March 12, 1988).
Bradley, D.W., et al., J. Gen. Virol., 69:1 (1988).
Bradley, D.W., et al., Proc. Nat. Acad. Sci. USA, 84:6277 (1987).
Gravelle, C.R., et al., J. Infect. Diseases, 131:167 (1975).
Kane, M.A., et al., JAMA, 252:3140 (1984).
Khuroo, M.S., Am. J. Med., 48:818 (1980).
Khuroo, M.S., et al., Am. J. Med., 68:818 (1983).
Maniatis, T., et al., Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory (1982).
Seto, B., et al., Lancet, 11:941 (1984).
Sreenivasan, M.A., et al., J. Gen. Virol., 65:1005 (1984).
Tabor, E., et al., J. Infect. Dis., 140:789 (1979).

SUMMARY OF THE INVENTION

Novel compositions, as well as methods of
preparation and use of the compositions, are provided,
where the compositions comprise viral proteins and
fragments thereof derived from the viral agent for
ET-NANB. A number of specific fragments of viral proteins
(and the corresponding genetic sequences) that are
particularly useful in diagnosis and vaccine
production are also disclosed. Methods for preparation
of ET-NANB viral proteins include isolating ET-NANB
genomic sequences which are then cloned and expressed
in a host cell. The resultant recombinant viral
proteins find use as diagnostic agents and as
vaccines. The genomic sequences and fragments thereof
find use in preparing ET-NANB viral proteins and as
probes for virus detection.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows vector constructions and
manipulations used in obtaining and sequencing a cloned
ET-NANB fragment; and

Figures 2A-2B are representations of
Southern blots in which a radiolabeled ET-NANB probe
was hybridized with amplified cDNA fragments prepared
from RNA isolated from infected (I) and non-infected
(N) bile sources (2A), and from infected (I) and
non-infected (N) stool-sample sources (2B).

DESCRIPTION OF SPECIFIC EMBODIMENTS

Novel compositions comprising genomic
sequences and fragments thereof derived from the viral
agent for ET-NANB are provided, together with
recombinant viral proteins produced using the genomic
sequences and methods of using these compositions.
Epitopes on the viral protein have been identified
that are particularly useful in diagnosis and vaccine
production. Small peptides containing the epitopes are
recognized by multiple sera of patients infected with
ET-NANB.

The molecular cloning of HEV was accomplished
by two very different approaches. The first
successful identification of a molecular clone was
based on the differential hybridization of putative
HEV cDNA clones to heterogeneous cDNA from infected
and uninfected cyno bile. cDNAs from both sources
were labeled to high specific activity with 32P to
identify a clone that hybridized specifically to the
infected source probe. A cyno monkey infected with
the Burma isolate of HEV was used in these first
experiments. The sensitivity of this procedure is
directly related to the relative abundance of the
specific sequence against the overall background. In
control experiments, it was found that specific
identification of a target sequence may be obtained
with as little as 1 specific part per 1000 background
sequences. A number of clones were identified by this
procedure using libraries and probes made from
infected (Burma isolate) and control uninfected cyno
bile. The first extensively characterized clone of
the 16 plaques purified by this protocol was given the
designation ET1.1.

ET1.1 was first characterized as both
derived from and unique to the infected source cDNA.
Heterogeneous cDNA was amplified from both infected
and uninfected sources using a sequence independent
single primer amplification technique (SISPA). This
technique is described in copending application serial
No. 208,512, filed June 17, 1988. The limited pool of
cDNA made from Burma infected cyno bile could then be
amplified enzymatically prior to cloning or
hybridization using putative HEV clones as probes.
ET1.1 hybridized specifically to the original bile
cDNA from the infected source. Further validation of
this clone as derived from the genome of HEV was
demonstrated by the similarity of the ET1.1 sequence
to those present in SISPA cDNA prepared from five
different human stool samples collected from
different ET-NANBH epidemics including Somalia,
Tashkent, Borneo, Mexico and Pakistan. These
molecular epidemiologic studies established the
isolated sequence as derived from the virus that
represented the major cause of ET-NANBH worldwide.

The viral specificity of ET1.1 was further
established by the finding that the clone hybridized
specifically to RNA extracted from infected cyno
liver. Hybridization analysis of polyadenylated RNA
demonstrated a unique 7.5 kb polyadenylated
transcript not present in uninfected liver. The size
of this transcript suggested that it represented the
full length viral genome. Strand specific
oligonucleotides were also used to probe viral genomic
RNA extracted directly from semi-purified virions
prepared from human stool. The strand specificity was
based on the RNA-directed RNA polymerase (RDRP) open
reading frame (ORF) identified in ET1.1 (see below).
Only the probe detecting the sense strand hybridized
to the nucleic acid. These studies characterized HEV
as a plus sense, single stranded genome. Strand
specific hybridization to RNA extracted from the liver
also established that the vast majority of
intracellular transcript was positive sense. Barring
any novel mechanism for virus expression, the negative
strand, although not detectable, would be present at a
ratio of less than 1:100 when compared with the sense
strand.

ET1.1 was documented as exogenous when
tested by both Southern blot hybridization and PCR
using genomic DNAs derived from uninfected humans,
infected and uninfected cynos, and also the genomic
DNAs from E. coli and various bacteriophage sources.
The latter were tested in order to rule out trivial
contamination with an exogenous sequence introduced
during the numerous enzymatic manipulations performed
during cDNA construction and amplification. It was
also found that the nucleotide sequence of the ET1.1
clone was not homologous to any entries in the
GenBank database. The translated open reading frame
of the ET1.1 clone did, however, demonstrate limited
homology with consensus amino acid residues consistent
with an RNA-directed RNA polymerase. This consensus
amino acid motif is shared among all positive strand
RNA viruses and, as noted above, is present at the 3'
end of the HCV genome. The 1.3 kb clone was therefore
presumed to be derived, at least in part, from the
nonstructural portion of the viral genome.

Because of the relationship of different
strains of ET-NANB to each other that has been
demonstrated by the present invention, the genome of
the ET-NANB viral agent is defined in this
specification as containing a region which is
homologous to the 1.33 kb DNA EcoRI insert present in
plasmid pTZKF1 (ET1.1) carried in E. coli strain BB4
and having ATCC deposit no. 67717. The entire
sequence, in both directions, has now been identified
as set forth below. The sequences of both strands are
provided, since both strands can encode proteins.
However, the sequence in one direction has been
designated as the "forward" sequence because of
statistical similarities to known proteins and because
the forward sequence is known to be predominantly
protein-encoding. This sequence is set forth below
along with the three possible translation sequences.
There is one long open reading frame that starts at
nucleotide 145 with an isoleucine and extends to the
end of the sequence. The two other reading frames have
many termination codons. Standard abbreviations for
nucleotides and amino acids are used here and
elsewhere in this specification.
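
As an illustration only (it is not part of the disclosed methods), the three-frame translation described above can be sketched in Python. The sketch assumes the standard genetic code, and the names `CODON_TABLE`, `translate` and `FRAGMENT` are hypothetical helpers introduced for the example. `FRAGMENT` holds the first 120 nucleotides of the forward sequence; stop codons are printed as `*`.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each
# position; "*" marks a termination codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int) -> str:
    """Translate dna starting at nucleotide `frame` (1, 2 or 3, 1-based)."""
    seq = dna.upper()
    # Step through complete codons only; a trailing partial codon is ignored.
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(frame - 1, len(seq) - 2, 3)
    )


# First 120 nucleotides of the forward sequence (SEQ ID NO. 1).
FRAGMENT = (
    "AGACCTGTCCCTGTTGCAGCTGTTCTACCACCCTGCCCCGAGCTCGAACAGGGCCTTCTC"
    "TACCTGCCCCAGGAGCTCACCACCTGTGATAGTGTCGTAACATTTGAATTAACAGACATT"
)

if __name__ == "__main__":
    for frame in (1, 2, 3):
        protein = translate(FRAGMENT, frame)
        print(f"frame {frame}: {protein.count('*')} stop codon(s)  {protein}")
```

Even on this short fragment, the first frame is free of termination codons while the second and third frames are interrupted, which is consistent with the description of one long open reading frame.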

The gene sequence given below is
substantially identical to one given in the parent
application. The present sequence differs in the
omission of the first 37 nucleotides at the 5' end and
last 13 nucleotides at the 3' end, which are derived
from the linker used for cloning rather than from the
virus. In addition, a G was omitted at position 227
of the sequence given in the parent application.

The following gene sequence has SEQ ID NO. 1;
the first amino acid sequence in reading frame
beginning with nucleotide 1 has SEQ ID NO. 2; the
second amino acid sequence in reading frame beginning
with nucleotide 2 has SEQ ID NO. 3; and the third amino
acid sequence in reading frame beginning with
nucleotide 3 has SEQ ID NO. 4.
Forward Sequence

SEQ ID NO. 1:

AGACCTGTCC CTGTTGCAGC TGTTCTACCA CCCTGCCCCG AGCTCGAACA GGGCCTTCTC 60
TACCTGCCCC AGGAGCTCAC CACCTGTGAT AGTGTCGTAA CATTTGAATT AACAGACATT 120
GTGCACTGCC GCATGGCCGC CCCGAGCCAG CGCAAGGCCG TGCTGTCCAC ACTCGTGGGC 180
CGCTACGGCG GTCGCACAAA GCTCTACAAT GCTTCCCACT CTGATGTTCG CGACTCTCTC 240
GCCCGTTTTA TCCCGGCCAT TGGCCCCGTA CAGGTTACAA CTTGTGAATT GTACGAGCTA 300
GTGGAGGCCA TGGTCGAGAA GGGCCAGGAT GGCTCCGCCG TCCTTGAGCT TGATCTTTGC 360
AACCGTGACG TGTCCAGGAT CACCTTCTTC CAGAAAGATT GTAACAAGTT CACCACAGGT 420
GAGACCATTG CCCATGGTAA AGTGGGCCAG GGCATCTCGG CCTGGAGCAA GACCTTCTGC 480
GCCCTCTTTG GCCCTTGGTT CCGCGCTATT GAGAAGGCTA TTCTGGCCCT GCTCCCTCAG 540
GGTGTGTTTT ACGGTGATGC CTTTGATGAC ACCGTCTTCT CGGCGGCTGT GGCCGCAGCA 600
AAGGCATCCA TGGTGTTTGA GAATGACTTT TCTGAGTTTG ACTCCACCCA GAATAACTTT 660
TCTCTGGGTC TAGAGTGTGC TATTATGGAG GAGTGTGGGA TGCCGCAGTG GCTCATCCGC 720
CTGTATCACC TTATAAGGTC TGCGTGGATC TTGCAGGCCC CGAAGGAGTC TCTGCGAGGG 780
TTTTGGAAGA AACACTCCGG TGAGCCCGGC ACTCTTCTAT GGAATACTGT CTGGAATATG 840
GCCGTTATTA CCCACTGTTA TGACTTCCGC GATTTTCAGG TGGCTGCCTT TAAAGGTGAT 900
GATTCGATAG TGCTTTGCAG TGAGTATCGT CAGAGTCCAG GAGCTGCTGT CCTGATCGCC 960
GGCTGTGGCT TGAAGTTGAA GGTAGATTTC CGCCCGATCG GTTTGTATGC AGGTGTTGTG 1020
GTGGCCCCCG GCCTTGGCGC GCTCCCTGAT GTTGTGCGCT TCGCCGGCCG GCTTACCGAG 1080
AAGAATTGGG GCCCTGGCCC TGAGCGGGCG GAGCAGCTCC GCCTCGCTGT TAGTGATTTC 1140
CTCCGCAAGC TCACGAATGT AGCTCAGATG TGTGTGGATG TTGTTTCCCG TGTTTATGGG 1200
GTTTCCCCTG GACTCGTTCA TAACCTGATT GGCATGCTAC AGGCTGTTGC TGATGGCAAG 1260
GCACATTTCA CTGAGTCAGT AAAACCAGTG CTCGA 1295

SEQ ID NO. 2:

Arg Pro Val Pro Val Ala Ala Val Leu Pro Pro Cys Pro Glu Leu Glu
1               5                   10                  15

Gln Gly Leu Leu Tyr Leu Pro Gln Glu Leu Thr Thr Cys Asp Ser Val
            20                  25                  30

Val Thr Phe Glu Leu Thr Asp Ile Val His Cys Arg Met Ala Ala Pro
        35                  40                  45

Ser Gln Arg Lys Ala Val Leu Ser Thr Leu Val Gly Arg Tyr Gly Gly
    50                  55                  60

Arg Thr Lys Leu Tyr Asn Ala Ser His Ser Asp Val Arg Asp Ser Leu
65                  70                  75                  80

Ala Arg Phe Ile Pro Ala Ile Gly Pro Val Gln Val Thr Thr Cys Glu
                85                  90                  95

Leu Tyr Glu Leu Val Glu Ala Met Val Glu Lys Gly Gln Asp Gly Ser
            100                 105                 110

Ala Val Leu Glu Leu Asp Leu Cys Asn Arg Asp Val Ser Arg Ile Thr
        115                 120                 125

Phe Phe Gln Lys Asp Cys Asn Lys Phe Thr Thr Gly Glu Thr Ile Ala
    130                 135                 140

His Gly Lys Val Gly Gln Gly Ile Ser Ala Trp Ser Lys Thr Phe Cys
145                 150                 155                 160

Ala Leu Phe Gly Pro Trp Phe Arg Ala Ile Glu Lys Ala Ile Leu Ala
                165                 170                 175

Leu Leu Pro Gln Gly Val Phe Tyr Gly Asp Ala Phe Asp Asp Thr Val
            180                 185                 190

Phe Ser Ala Ala Val Ala Ala Ala Lys Ala Ser Met Val Phe Glu Asn
        195                 200                 205

Asp Phe Ser Glu Phe Asp Ser Thr Gln Asn Asn Phe Ser Leu Gly Leu
    210                 215                 220

Glu Cys Ala Ile Met Glu Glu Cys Gly Met Pro Gln Trp Leu Ile Arg
225                 230                 235                 240

Leu Tyr His Leu Ile Arg Ser Ala Trp Ile Leu Gln Ala Pro Lys Glu
                245                 250                 255

Ser Leu Arg Gly Phe Trp Lys Lys His Ser Gly Glu Pro Gly Thr Leu
            260                 265                 270

Leu Trp Asn Thr Val Trp Asn Met Ala Val Ile Thr His Cys Tyr Asp
        275                 280                 285

Phe Arg Asp Phe Gln Val Ala Ala Phe Lys Gly Asp Asp Ser Ile Val
    290                 295                 300

Leu Cys Ser Glu Tyr Arg Gln Ser Pro Gly Ala Ala Val Leu Ile Ala
305                 310                 315                 320

Gly Cys Gly Leu Lys Leu Lys Val Asp Phe Arg Pro Ile Gly Leu Tyr
                325                 330                 335

Ala Gly Val Val Val Ala Pro Gly Leu Gly Ala Leu Pro Asp Val Val
            340                 345                 350

Arg Phe Ala Gly Arg Leu Thr Glu Lys Asn Trp Gly Pro Gly Pro Glu
        355                 360                 365

Arg Ala Glu Gln Leu Arg Leu Ala Val Ser Asp Phe Leu Arg Lys Leu
    370                 375                 380

Thr Asn Val Ala Gln Met Cys Val Asp Val Val Ser Arg Val Tyr Gly
385                 390                 395                 400

Val Ser Pro Gly Leu Val His Asn Leu Ile Gly Met Leu Gln Ala Val
                405                 410                 415

Ala Asp Gly Lys Ala His Phe Thr Glu Ser Val Lys Pro Val Leu
            420                 425                 430

SEQ ID NO. 3:

Asp Leu Ser Leu Leu Gln Leu Phe Tyr His Pro Ala Pro Ser Ser Asn
1               5                   10                  15

Arg Ala Phe Ser Thr Cys Pro Arg Ser Ser Pro Pro Val Ile Val Ser
            20                  25                  30

 .  His Leu Asn  .  Gln Thr Leu Cys Thr Ala Ala Trp Pro Pro Arg
        35                  40                  45

Ala Ser Ala Arg Pro Cys Cys Pro His Ser Trp Ala Ala Thr Ala Val
    50                  55                  60

Ala Gln Ser Ser Thr Met Leu Pro Thr Leu Met Phe Ala Thr Leu Ser
65                  70                  75                  80

Pro Val Leu Ser Arg Pro Leu Ala Pro Tyr Arg Leu Gln Leu Val Asn
                85                  90                  95

Cys Thr Ser  .  Trp Arg Pro Trp Ser Arg Arg Ala Arg Met Ala Pro
            100                 105                 110

Pro Ser Leu Ser Leu Ile Phe Ala Thr Val Thr Cys Pro Gly Ser Pro
        115                 120                 125

Ser Ser Arg Lys Ile Val Thr Ser Ser Pro Gln Val Arg Pro Leu Pro
    130                 135                 140

Met Val Lys Trp Ala Arg Ala Ser Arg Pro Gly Ala Arg Pro Ser Ala
145                 150                 155                 160

Pro Ser Leu Ala Leu Gly Ser Ala Leu Leu Arg Arg Leu Phe Trp Pro
                165                 170                 175

Cys Ser Leu Arg Val Cys Phe Thr Val Met Pro Leu Met Thr Pro Ser
            180                 185                 190

Ser Arg Arg Leu Trp Pro Gln Gln Arg His Pro Trp Cys Leu Arg Met
        195                 200                 205

Thr Phe Leu Ser Leu Thr Pro Pro Arg Ile Thr Phe Leu Trp Val  . 
    210                 215                 220

Ser Val Leu Leu Trp Arg Ser Val Gly Cys Arg Ser Gly Ser Ser Ala
225                 230                 235                 240

Cys Ile Thr Leu  .  Gly Leu Arg Gly Ser Cys Arg Pro Arg Arg Ser
                245                 250                 255

Leu Cys Glu Gly Phe Gly Arg Asn Thr Pro Val Ser Pro Ala Leu Phe
            260                 265                 270

Tyr Gly Ile Leu Ser Gly Ile Trp Pro Leu Leu Pro Thr Val Met Thr
        275                 280                 285

Ser Ala Ile Phe Arg Trp Leu Pro Leu Lys Val Met Ile Arg  .  Cys
    290                 295                 300

Phe Ala Val Ser Ile Val Arg Val Gln Glu Leu Leu Ser  .  Ser Pro
305                 310                 315                 320

Ala Val Ala  .  Ser  .  Arg  .  Ile Ser Ala Arg Ser Val Cys Met
                325                 330                 335

Gln Val Leu Trp Trp Pro Pro Ala Leu Ala Arg Ser Leu Met Leu Cys
            340                 345                 350

Ala Ser Pro Ala Gly Leu Pro Arg Arg Ile Gly Ala Leu Ala Leu Ser
        355                 360                 365

Gly Arg Ser Ser Ser Ala Ser Leu Leu Val Ile Ser Ser Ala Ser Ser
    370                 375                 380

Arg Met  .  Leu Arg Cys Val Trp Met Leu Phe Pro Val Phe Met Gly
385                 390                 395                 400

Phe Pro Leu Asp Ser Phe Ile Thr  .  Leu Ala Cys Tyr Arg Leu Leu
                405                 410                 415

Leu Met Ala Arg His Ile Ser Leu Ser Gln  .  Asn Gln Cys Ser
            420                 425                 430

SEQ ID NO. 4:

Thr Cys Pro Cys Cys Ser Cys Ser Thr Thr Leu Pro Arg Ala Arg Thr
1               5                   10                  15

Gly Pro Ser Leu Pro Ala Pro Gly Ala His His Leu  .   .  Cys Arg
            20                  25                  30

Asn Ile  .  Ile Asn Arg His Cys Ala Leu Pro His Gly Arg Pro Glu
        35                  40                  45

Pro Ala Gln Gly Arg Ala Val His Thr Arg Gly Pro Leu Arg Arg Ser
    50                  55                  60

His Lys Ala Leu Gln Cys Phe Pro Leu  .  Cys Ser Arg Leu Ser Arg
65                  70                  75                  80

Pro Phe Tyr Pro Gly His Trp Pro Arg Thr Gly Tyr Asn Leu  .  Ile
                85                  90                  95

Val Arg Ala Ser Gly Gly His Gly Arg Glu Gly Pro Gly Trp Leu Arg
            100                 105                 110

Arg Pro  .  Ala  .  Ser Leu Gln Pro  .  Arg Val Gln Asp His Leu
        115                 120                 125

Leu Pro Glu Arg Leu  .  Gln Val His His Arg  .  Asp His Cys Pro
    130                 135                 140

Trp  .  Ser Gly Pro Gly His Leu Gly Leu Glu Gln Asp Leu Leu Arg
145                 150                 155                 160

Pro Leu Trp Pro Leu Val Pro Arg Tyr  .  Glu Gly Tyr Ser Gly Pro
                165                 170                 175

Ala Pro Ser Gly Cys Val Leu Arg  .  Cys Leu  .   .  His Arg Leu
            180                 185                 190

Leu Gly Gly Cys Gly Arg Ser Lys Gly Ile His Gly Val  .  Glu  . 
        195                 200                 205

Leu Phe  .  Val  .  Leu His Pro Glu  .  Leu Phe Ser Gly Ser Arg
    210                 215                 220

Val Cys Tyr Tyr Gly Gly Val Trp Asp Ala Ala Val Ala His Pro Pro
225                 230                 235                 240

Val Ser Pro Tyr Lys Val Cys Val Asp Leu Ala Gly Pro Glu Gly Val
                245                 250                 255

Ser Ala Arg Val Leu Glu Glu Thr Leu Arg  .  Ala Arg His Ser Ser
            260                 265                 270

Met Glu Tyr Cys Leu Glu Tyr Gly Arg Tyr Tyr Pro Leu Leu  .  Leu
        275                 280                 285

Pro Arg Phe Ser Gly Gly Cys Leu  .  Arg  .   .  Phe Asp Ser Ala
    290                 295                 300

Leu Gln  .  Val Ser Ser Glu Ser Arg Ser Cys Cys Pro Asp Arg Arg
305                 310                 315                 320

Leu Trp Leu Glu Val Glu Gly Arg Phe Pro Pro Asp Arg Phe Val Cys
                325                 330                 335

Arg Cys Cys Gly Gly Pro Arg Pro Trp Arg Ala Pro  .  Cys Cys Ala
            340                 345                 350

Leu Arg Arg Pro Ala Tyr Arg Glu Glu Leu Gly Pro Trp Pro  .  Ala
        355                 360                 365

Gly Gly Ala Ala Pro Pro Arg Cys  .   .  Phe Pro Pro Gln Ala His
    370                 375                 380

Glu Cys Ser Ser Asp Val Cys Gly Cys Cys Phe Pro Cys Leu Trp Gly
385                 390                 395                 400

Phe Pro Trp Thr Arg Ser  .  Pro Asp Trp His Ala Thr Gly Cys Cys
                405                 410                 415

 .  Trp Gln Gly Thr Phe His  .  Val Ser Lys Thr Ser Ala Arg
            420                 425                 430



The complementary strand, referred to here
as the "reverse sequence," is set forth below in the
same manner as the forward sequence set forth above.
Several open reading frames, shorter than the long
open reading frame found in the forward sequence, can
be seen in this reverse sequence. Because of the
relative brevity of the open reading frames in the
reverse direction, they are probably not expressed.
The following gene sequence has SEQ ID NO. 5.
Reverse Sequence

TCGAGCACTG GTTTTACTGA CTCAGTGAAA TGTGCCTTGC CATCAGCAAC AGCCTGTAGC 60
ATGCCAATCA GGTTATGAAC GAGTCCAGGG GAAACCCCAT AAACACGGGA AACAACATCC 120
ACACACATCT GAGCTACATT CGTGAGCTTG CGGAGGAAAT CACTAACAGC GAGGCGGAGC 180
TGCTCCGCCC GCTCAGGGCC AGGGCCCCAA TTCTTCTCGG TAAGCCGGCC GGCGAAGCGC 240
ACAACATCAG GGAGCGCGCC AAGGCCGGGG GCCACCACAA CACCTGCATA CAAACCGATC 300
GGGCGGAAAT CTACCTTCAA CTTCAAGCCA CAGCCGGCGA TCAGGACAGC AGCTCCTGGA 360
CTCTGACGAT ACTCACTGCA AAGCACTATC GAATCATCAC CTTTAAAGGC AGCCACCTGA 420
AAATCGCGGA AGTCATAACA GTGGGTAATA ACGGCCATAT TCCAGACAGT ATTCCATAGA 480
AGAGTGCCGG GCTCACCGGA GTGTTTCTTC CAAAACCCTC GCAGAGACTC CTTCGGGGCC 540
TGCAAGATCC ACGCAGACCT TATAAGGTGA TACAGGCGGA TGAGCCACTG CGGCATCCCA 600
CACTCCTCCA TAATAGCACA CTCTAGACCC AGAGAAAAGT TATTCTGGGT GGAGTCAAAC 660
TCAGAAAAGT CATTCTCAAA CACCATGGAT GCCTTTGCTG CGGCCACAGC CGCCGAGAAG 720
ACGGTGTCAT CAAAGGCATC ACCGTAAAAC ACACCCTGAG GGAGCAGGGC CAGAATAGCC 780
TTCTCAATAG CGCGGAACCA AGGGCCAAAG AGGGCGCAGA AGGTCTTGCT CCAGGCCGAG 840
ATGCCCTGGC CCACTTTACC ATGGGCAATG GTCTCACCTG TGGTGAACTT GTTACAATCT 900
TTCTGGAAGA AGGTGATCCT GGACACGTCA CGGTTGCAAA GATCAAGCTC AAGGACGGCG 960
GAGCCATCCT GGCCCTTCTC GACCATGGCC TCCACTAGCT CGTACAATTC ACAAGTTGTA 1020
ACCTGTACGG GGCCAATGGC CGGGATAAAA CGGGCGAGAG AGTCGCGAAC ATCAGAGTGG 1080
GAAGCATTGT AGAGCTTTGT GCGACCGCCG TAGCGGCCCA CGAGTGTGGA CAGCACGGCC 1140
TTGCGCTGGC TCGGGGCGGC CATGCGGCAG TGCACAATGT CTGTTAATTC AAATGTTACG 1200
ACACTATCAC AGGTGGTGAG CTCCTGGGGC AGGTAGAGAA GGCCCTGTTC GAGCTCGGGG 1260
CAGGGTGGTA GAACAGCTGC AACAGGGACA GGTCT 1295
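
For illustration only, the relationship between the forward and reverse sequences can be sketched as a reverse-complement operation. This is a minimal sketch, not part of the disclosed methods; `reverse_complement` and `FORWARD_3PRIME_END` are hypothetical names, and the input is the last 35 nucleotides (1261-1295) of the forward sequence.

```python
# Complement each base (A<->T, C<->G), then reverse the strand so that the
# result is again written 5' to 3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


# Last 35 nucleotides (positions 1261-1295) of the forward sequence.
FORWARD_3PRIME_END = "GCACATTTCACTGAGTCAGTAAAACCAGTGCTCGA"

if __name__ == "__main__":
    # These 35 nucleotides should correspond to the first 35 nucleotides
    # of the reverse sequence.
    print(reverse_complement(FORWARD_3PRIME_END))
```

Applying the function twice returns the original strand, which gives a quick consistency check when comparing the two listings.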



Identity of this sequence with sequences in
etiologic agents has been confirmed by locating a
corresponding sequence in a viral strain isolated in
Burma. The Burmese isolate contains the following
sequence of nucleotides (one strand and open reading
frames shown). The following gene sequence has SEQ ID
NO. 6; the protein sequence corresponding to ORF1 has
SEQ ID NO. 7; ORF2 has SEQ ID NO. 8; and ORF3 has SEQ ID
NO. 9.
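
The Burma strain listing gives amino acids in one-letter code, whereas SEQ ID NOS. 2-4 use three-letter code. As a reading aid only, a small conversion helper can be sketched as follows; `THREE_LETTER` and `to_three_letter` are hypothetical names introduced for this example, and the mapping is the standard set of amino acid abbreviations.

```python
# Standard one-letter to three-letter amino acid abbreviations.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def to_three_letter(protein: str) -> str:
    """Convert a one-letter protein string to space-separated three-letter codes."""
    return " ".join(THREE_LETTER[aa] for aa in protein.upper())


if __name__ == "__main__":
    # First 16 residues of the ORF1 segment that overlaps clone ET1.1.
    print(to_three_letter("RPVPVAAVLPPCPELE"))
```

The printed line can be compared directly against the first row of SEQ ID NO. 2.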



SEQUENCE OF HEV (BURMA STRAIN)
|-ORF1-->

MEAHQFIKAPG
AGGCAGACCACATATGTGGTCGATGCCATGGAGGCCCATCAGTTTATTAAGGCTCCTGGC

ITTAIEQAALAAANSALANA
ATCACTACTGCTATTGAGCAGGCTGC^CTAGCAGCGGCCAACTCTGCCCTGGCGAATGCT 120

VVVRPFLSHQQIEILINLMQ
GTGGTAGTTAGGCCTTTTCTCTCTCACCAGCAGATTGAGATCCTCATTAACCTAATGCAA

PRQLVFRPEVFWNHPIQRVI
CCTCGCCAGCTTGTTTTCCGCCCCGAGGTTTTCTGGAATCATCCCATCCAGCGTGTCATC 240

HNELELYCRARSGRCLEIGA
CATAACGAGCTGGAGCTTTACTGCCGCGCCCGCTCCGGCCGCTGTCTTGAAATTGGCGCC

HPRSINDNPNVVHRCFLRPV
CATCCCCGCTCAATAAATGATAATCCTAATGTGGTCCACCGCTGCTTCCTCCGCCCTGTT 360

GRDVQRWYTAPTRGPAANCR
GGGCGTGATGTTCAGCGCTGGTATACTGCTCCCACTCGCGGGCCGGCTGCTAATTGCCGG

RSALRGLPAADRTYCLDGFS
CGTTCCGCGCTGCGCGGGCTTCCCGCTGCTGACCGCACTTACTGCCTCGACGGGTTTTCT 480

GCNFPAETGIALYSLHDMSP
GGCTGTAACTTTCCCGCCGAGACTGGCATCGCCCTCTACTCCCTTCATGATATGTCACCA

SDVAEAMFRHGMTRLYAALH
TCTGATGTCGCCGAGGCCATGTTCCGCCATGGTATGACGCGGCTCTATGCCGCCCTCCAT 600

LPPEVLLPPGTYRTASYLLI
CTTCCGCCTGAGGTCCTGCTGCCCCCTGGCACATATCGCACCGCATCGTATTTGCTAATT

HDGRRVVVTYEGDTSAGYNH
CATGACGGTAGGCGCGTTGTGGTGACGTATGAGGGTGATACTAGTGCTGGTTACAACCAC 720

DVSNLRSWIRTTKVTGDHPL
GATGTCTCCAACTTGCGCTCCTGGATTAGAACCACCAAGGTTACCGGAGACCATCCCCTC

VIERVRAIGCHFVLLLTAAP
GTTATCGAGCGGGTTAGGGCCATTGGCTGCCACTTTGTTCTCTTGCTCACGGCAGCCCCG 840

EPSPMPYVPYPRSTEVYVRS
GAGCCATCACCTATGCCTTATGTTCCTTACCCCCGGTCTACCGAGGTCTATGTCCGATCG
IFGPGG^^S.^P^rCSTKST

atcttcggcccggg^ggcac:::^^::':m'v:::aa::':atgctccactaagtcgacc 960 

FHAy::' ^"!w:;_M^FGATLD 
TTCCATGC"G^CCCTGCCC^^:.T^TGG3-C:3^C^^ATGC^GTTCGGGGCCACCTTGGAT 

D g A F C C S P _ ^' ^ ' . r G : S r K V 
GACCAAGCC^TTTGCGCrT'^G^^'A-^G-CrACCTTCGCGGCATTAGCTACAAGGTC 1080 

T V G ^ . - N E G -1 N ^ S. E D A L T A 
ACTGTTGGTACCC^^GTGGrAATGAAGGC'GGAA^GCCTCTGAGGACGCCCTCACAGCT 

V I T A A T ^ ^ : C ^ C ' L ^! T g A I 

GTTATCACTGCCGCCTACC"ACCA'^TGCCACCAGCGGTATCTCCGCACCCAGGCTATA 1200 

SKGMRPLE^E-A QvFITRLY 
TCCAAGGGGATGCGTCGTC^GGAACGGG-GCA^GCCCAGAAGTTTATAACACGCCTCTAC 

S W L F E K S G 0 ' : P G R g L E F Y 
AGCTGGCTCTTCGAGAAGTCCGGCCGTGATTACATCCCTGGCCGTCAGTTGGAGTTCTAC 1320 

AQCRRWLSAGFHLDPRVLVF 
GCCCAGTGCAGGCGCTGGCTCTCCGCCGGCTTTCATCTTGATCCACGGGTGTTGGTTTTT 

OESAPCHCRTAIRKALSKFC 
GACGAGTCGGCCCCCTGCCATTGTAGGACCGCGATCCGTAAGGCGCTCTCAAAGTTTTGC 1440 

C F M K W L G g E C ^ C L 0 P A E G A 
TGCTTCATGAAGTGGCTTGG'CAGGAGTGCACCTGCTTCCTTCAGCCTGCAGAAGGCGCC 

VGOgGHDNEAYEGSDVDPAE 
GTCGGCGACCAGGGTCATGATAATGAAGCCTATGAGGGGTCCGATGTTGACCCTGCTGAG 1560 

SAISDISGSYVVPGTALQPL 
TCCGCCATTAGTGACATATCTGGGTCCTATGTCG-rCCCTGGCACTGCCCTCCAACCGCTC 

YQALDLPAEIVARAGRLTAT 
TACCAGGCCCTCGATCTCCCCGCTGAGATTGTGGCTCGCGCGGGCCGGCTGACCGCCACA 1680 

VKVSQVDGRIDCETLLGNKT 
GTAAAGGTCTCCCAGGTCGATGGGCGGATCGATTGCGAGACCCTTCTTGGTAACAAAACC 

FRTSFVOGAVLETNGPERHN 
TTTCGCACGTCGTTCGTTGACGGGGCGGTCTTAGAGACCAATGGCCCAGAGCGCCACAAT 1800 

LSFOAS05TMAAGPFSLTYA 
CTCTCCTTCGATGCCAGTCAGAGCACTATGGCCGCTGGCCCTTTCAGTCTCACCTATGCC 

ASAAGLEVRYVAAGLDHRAV 
GCCTCTGCAGCTGGGCTGGAGGTGCGCTATGTTGCTGCCGGGCTTGACCATCGGGCGGTT 1920 

FAPGVSPRSAPGEVTAFCSA 
TTTGCCCCCGGTGTTTCACCCCGGTCAGCCCCCGGCGAGGTTACCGCCTTCTGCTCTGCC 

LYRFNREAgRHSLIGNLWFH 
CTATACAGGTTTAACCGTGAGGCCCAGCGCCATTCGCTGATCGGTAACTTATGGTTCCAT 2040 

PEGLIGLFAPFSPGHVWESA 
CCTGAGGGACTCATTGGCCTCTTCGCCCCGTTTTCGCCCGGGCATGTTTGGGAGTCGGCT 
NPFCGES^.T^RTWSEVDAV
AATCCATTCTGTGGCG;Gagcaca::^'ACACCCGTACTTGGTCGGAGGTTGATGCCGTC 2160 

SSPARPDLG'^MSEPSIPSRA 
TCTAGTCCAGCCCGGCC^GAC—AGGTTTT-ATGTCTGAGCCTTCTATACCTAGTAGGGCC 

atptl--p.?ppapopsppp 

GCCACGCCTiCCC"G:GGrrrr-C^AC:::CCCCTGCACCGGACCCTTCCCCCCCTCCC 2280 

SAPA.AE^ASGATAGAPAIT 
TCTGCCCCGGCGC"TGC^GAGC:GGC"C'GGCGCTACCGCCGGGGCCCCGGCCATAACT 

H 0 T A K ^ :< . . T y P D G S K V F 
CACCAGACGGCCCGGCACCGC:g:C'GC'C^TCACCTACCCGGATGGCTCTAAGGTATTC 2400 

AGSLf^ES^C'WLVNASNVDH 
GCCGGCTCGCTGTTCGAGTCGACATGCACGTGGCTCGTTAACGCGTCTAATGTTGACCAC 

RPGGGLCHAFYQRYPASFDA 
CGCCCTGGCGGCGGGCTTTGCCATGCATTTTACCAAAGGTACCCCGCCTCCTTTGATGCT 2520 

ASFVMRDGAAAYTLTPRPII 
GCCTCTTTTGTGATGCGCGACGGCGCGGCCGCGTACACACTAACCCCCCGGCCAATAATT 

HAVAPDYR'lEHNPKRLEAAY 
CACGCTGTCGCCCCTGATTATAGGTTGGAACATAACCCAAAGAGGCTTGAGGCTGCTTAT 2640 

retcsrlgtaaypllgtgiy 
cgggaaacttgctcccgcctcggcaccgctgcatacccgctcctcgggaccggcatatac 

Qvpigpsfdawernhrpgoe 

CAGGTGCCGATCGGCCCCAGTTTTGACGCCTGGGAGCGGAACCACCGCCCCGGGGATGAG 2760 

lylpelaarwfeanrptrpt 

TTGTACCTTCCTGAGCTTGCTGCCAGATGGTTTGAGGCCAATAGGCCGACCCGCCCGACT 

ltiteovartanlaielosa 

CTCACTATAACTGAGGATGTTGCACGGACAGCGAATCTGGCCATCGAGCTTGACTCAGCC 2880 

tovgracagcrvtpgvvqyq 
acagatgtcggccgggcctgtgccggctgtcgggtcacccccggcgttgttcagtaccag 

ftagvpgsgksrsitqaovo 

TTTACTGCAGGTGTGCCTGGATCCGGCAAGTCCCGCTCTATCACCCAAGCCGATGTGGAC 3000 

vvvvptrelrnawrrrgfaa 
gttgtcgtggtcccgacgcgtgagttgcgtaatgcctggcgccgtcgcggctttgctgct 

ftphtaarvtqgrrvvidea 

TTTACCCCGCATACTGCCGCCAGAGTCACCCAGGGGCGCCGGGTTGTCATTGATGAGGCT 3120 

PSLPPHLLLLHHqRAATVHL 
CCATCCCTCCCCCCTCACCTGCTGCTGCTCCACATGCAGCGGGCCGCCACCGTCCACCTT 

lgdpnqipaidfehaglvpa 

CTTGGCGACCCGAACCAGATCCCAGCCATCGACTTTGAGCACGCTGGGCTCGTCCCCGCC 3240 

irpolgptsw whvthrwpad 
atcaggcccgacttaggccccacctcctggtggcatgttacccatcgctggcctgcggat 
vcel!rg" '-^:ottsrvlr
gtatgcgag:^c-tc:g'".g'gC'^'ac:::^tg^'cca:accac"agccgggttctccgt 3360 

s l f w g e 2 ^ v 3 q k l v f t q a a k 

TCGTTG^^C'GGGGTGAGC C^GCCGTCGGGI -GA AAC'A j'GTTCACCCAGGCGGCCAAG 
P A N P G S ^ : V - E A 0 G ^ ^ ^ E T 

cccg:caac:::ggc^c-r3":ggk:-c3'^g3cgc-3ggcgc'acctacacggagacc 3480 
^ : :a-;:::g.:qssrahai 

ACTATTA ■ , GCCACAGlAGA^o^^CG'O'jG^." ■-"^ ■ CAG'CGTCTCGGGCTCATGCCATT 

val^k-'e*^^.:::a?gllr 
gttgc^ctgacgegcc-^cac'g-gaagt^cg'c-^'ca — gacgcaccaggcctgcttcgc 3600 

E V G : 5 J - : V s N - ^ . A G G E I G 
GAGGTGGGCATCTCCGATGrAA':G':«A-A:,:'-'-:,;:cGCTGG^GGCGAAATTGGT 

H 0 R p s : : ^ p :^ s A s ,f D T L A 

caccauCg:ccatcagt^at':cccgtgg:a:c:ctgacgc:aatgttgacaccctggct 3720 

afppsc()isafholaeelgh 
gccttcccgccgtcttgccagattagtgcr^ccatcagttggctgaggagcttggccac 

rpvp vaavlp?::?eleogll 
agacctgtccctgttgcagctgttctaccaccctgccccgagctcgaacagggccttctc 3840 

ylpqel^rcdsvvtfeltdi 
tacctgccccaggagc^caccacctgtgatagtgtcgtaacatttgaattaacagacatt 

vhcrmaapsqrkavlstlvg 

GTGCACTGCCGCATGGCCGCCCCGAGCCAGCGCAAGGCCGTGCTGTCCACACTCGTGGGC 3960 

RYGGRTKLYNAShSOVRDSL 
CGCTACGGCGGTCGCACAAAGCTCTACAATGCTTCCCACTCTGATGTTCGCGACTCTCTC 

ARFIPAIGPVQVTTCELYEL 
GCCCGTTTTATCCCGGCCATTGGCCCCGTACAGGTTACAACTTGTGAATTGTACGAGCTA 4080 

VEAMVEKGQOGSAVLELOLC 
GTGGAGGCCATGGTCGAGAAGGGCCAGGATGGCTCCGCCGTCCTTGAGCTTGATCTTTGC 

NROVSRITFFQ<DCNKFTTG 
AACCGTGACGTGTCCAGGATCACCTTCTTCCAGAAAGATTGTAACAAGTTCACCACAGGT 4200 

ETIAHGKVGQGIb'- SKTFC 
GA6ACCATTGCCCATGGTAAAGTGGGCCAGGGCATCTCGGCCTGGAGCAAGACCTTCTGC 

ALFGPWFRA:EK, AILALLPQ 
GCCCTCTTTGGCCCTTGGTTCCGCGCTATTGAGAAGGCTATTCTGGCCCTGCTCCCTCAG 4320 

GVFYGDAFDOTVFSAAVAAA 
GGTGTGTTTTACGGTGATGCCTTTGATGACACCGTCTTCTCGGCGGCTGTGGCCGCAGCA 

KASHVFENDFSEFDSTQNNF 
AAGGCATCCATGGTG7TTGAGAATGACTTTTCTGAGTTTGACTCCACCCAGAATAACTTT 4440 

SLGLECAIMEECGHPQWLIR 
TCTCTGGGTCTAGAG'GTGCTATTATGGAGGAGTGTGGGATGCCGCAGTGGCTCATCCGC 



18. 





LYHLIRSAWILQAPKESLRG 
CTGTATCACCTTATAAGGTCTGCGTGGATCTTGCAGGCCCCGAAGGAGTCTCTGCGAGGG 4560 

FWKKHSGEPGTLLWNTVWNM 

5 TTTTGGAAGAAACACTCCGGTGAGCCCGGCACTCTTCTATGGAATACTGTCTGGAATATG 

AVITHCYDFRDFQVAAFKGD 
GCCGTTATTACCCACTGTTATGACTTCCGCGATTTTCAGGTGGCTGCCTTTAAAGGTGAT 4680 

IC 05IVLCSEYRQSPGAAVLIA 
GATTCGATAGTGCTTTGCAGTGAGTATCGTCAGAGTCCAGGAGCTGCTGTCCTGATCGCC 

GCGLKLKVDFRPIGLYAGVV 
GGCTGTGGCTTGAAGTTGAAGGTAGATTTCCGCCCGATCGGTTTGTATGCAGGTGTTGTG 4800 
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VAPGLGALPOVVRFAGRLTE 
GTGGCCCCCGGCCTTGGCGCGCTCCCTGATGTTGTGCGCTTCGCCGGCCGGCTTACCGAG 



KNWGPGPERAEQLRLAVSDF 
AAGAATTGGGGCCCTGGCCCTGAGCGGGCGGAGCAGCTCCGCCTCGCTGTTAGTGATTTC 4920

LRKLTNVAQMCVDVVSRVYG 
CTCCGCAAGCTCACGAATGTAGCTCAGATGTGTGTGGATGTTGTTTCCCGTGTTTATGGG 

VSPGLVHNLIGMLQAVADGK

GTTTCCCCTGGACTCGTTCATAACCTGATTGGCATGCTACAGGCTGTTGCTGATGGCAAG 5040 



AHFTESVKPVLDLTNSILCR
GCACATTTCACTGAGTCAGTAAAACCAGTGCTCGACTTGACAAATTCAATCTTGTGTCGG

|-ORF3--->

MNNMSFAAPMGSRPCALG 

M R P R P 

V E Z |-ORF2-->

GTGGAATGAATAACATGTCTTTTGCTGCGCCCATGGGTTCGCGACCATGCGCCCTCGGCC 5160 

LFCCCSSCFCLCCPRHRPVS 
ILLLLLMFLPMLPAPPPGQP 

TATTTTGCTGCTGCTCCTCATGTTTTTGCCTATGCTGCCCGCGCCACCGCCCGGTCAGCC

RLAAVVGGAAAVPAVVSGVT 
SGRRRGRRSGGSGGGFWGDR 

GTCTGGCCGCCGTCGTGGGCGGCGCAGCGGCGGTTCCGGCGGTGGTTTCTGGGGTGACCG 5280

GLILSPSQSPIFIQPTPSPP 
VDSQPFAIPYIHPTNPFAPD

GGTTGATTCTCAGCCCTTCGCAATCCCCTATATTCATCCAACCAACCCCTTCGCCCCCGA 

MSPLRPGLDLVFANPPDHSA 
VTAAAGAGPRVRQPARPLGS 

TGTCACCGCTGCGGCCGGGGCTGGACCTCGTGTTCGCCAACCCGCCCGACCACTCGGCTC 5400 
P L G V ^ K P S " ^ . - M V 7 0 L P Q

A W R J : G - ^"^ , A S P p p P T T A 

cgcttggcgtgaccagg::cag:g::::3::g''gc:'cacgtcgtagacctaccacagc 
L G p p R : 

GAAP '^-:.-^:mjtppvPDV 
"ggggccgcgccgc'aaci gzg j'cgc ' - j " atgac accc cgccagtgcctgatgt 5520 

s p G : . = ^ , • N . s r s p L T s 



cg'^c"cccgcggcg:ca^:^'^:3c:':::"g'"'a:c::aicaacatctccccrtacctc 

15 s v a ^ g ' n : . . ^ a a p l s p l l p 

ttccgtggccaccggcactaac:tgg:t:"tatgccgcccctcttagtccgcttttacc 5640 
lqdg^n^himateasnyaqy 


CCTTCAGGACGGCACCAATACCCATATAATGGCCACdGAAGCTTCTAATTATGCCCAGTA 

RVARATIP-PPLVPNAVGGY 

25 CCGGGTTGCCCGTGCCACAATCCG'"ac:gCCCGCT3GTCCCCAATGCTGTCGGCGGTTA 5760 

AISISFWPQTTTTPTSVDMN



cgccatctccatctcattctggccacagaccaccaccaccccgacgtccgttgatatgaa 
sitstdvrilvqpgiaselv 

TTCAATAACCTCGACGGATGTTCGTATTTTAGTCCAGCCCGGCATAGCCTCTGAGCTTGT 5880 
IPSERLHYRNQGWRSVETSG

GATCCCAAGTGAGCGCCTACACTATCGTAACCAAGGCTGGCGCTCCGTCGAGACCTCTGG 
VAEEEATSGLVMLCIHGSLV 


GGTGGCTGAGGAGGAGGCTACCTCTGGTCTTGTTATGCTTTGCATACATGGCTCACTCGT 6000 
NSYTNTPYTGALGLLDFALE 

AAATTCCTATACTAATACACCCTATACCGGTGCCCTCGGGCTGTTGGACTTTGCCCTTGA

LEFRNLTPGNTNTRVSRYSS 



GCTTGAGTTTCGCAACCTTACCCCCGGTAACACCAATACGCGGGTCTCCCGTTATTCCAG 6120
TARHRLRRGADGTAELTTTA 



CACTGCTCGCCACCGCCTICGTCGCGGTGCGGACGGGACTGCCGAGCTCACCACCACGGC 
55 ATRFM<DL^F^STNGVGEIG 

TGCTACCCGCTTTATGAAGGACC'CTAT— AC';gTACTAATGGTGTCGGTGAGATCGG 6240 
CCGCGGGATAGCCCTCACCCTGTTCAACCTTGCTGACACTCTGCTTGGCGGCCTGCCGAC

ELISSAG30^Ft SRPVVSAN 

AGAATTGA—'CG'CGGC^GG^GGCCAGC^GTTCTACTCCCGTCCCGTTGTCTCAGCCAA 6360 
G t ? T . ^ , T ^ E N A 0 0 D K G I 

TGGCGAGCCGAC^GTTAAGTTGTATACATC^GTAGAGAATGCTCAGCAGGATAAGGGTAT 

A I P H D : D L G E S R V V I Q D Y D N 

TGCAAT:CCGCATG:CA"3ACCT:GGAGAATCTCGTGTGGTTATTCAGGATTATGATAA 6480 
QHE0: = ^"^ 5PAPSRPFSVL 

ccaacatgaacaaga^cggc:ga:gc::t^ctccagccccatcgcgccctttctctgtcct 

RANDVLWLSLTAAEYDQSTY 
TCGAGCTAATGATGTGCTTTGGCTCTCTCTCACCGCTGCCGAGTATGACCAGTCCACTTA 6600 

gsstgpvyvsosvtlvnvat 
tggctcttcgactggcccagtttatgtttctgactctgtgaccttggttaatgttgcgac 
gaqavarsldwtkvtlogrp 

CGGCGCGCAGGCCGTTGCCCGGTCGCTCGATTGGACCAAGGTCACACTTGACGGTCGCCC 6720 
LSTIQQYSKTFFVLPLRGKL 

CCTCTCCACCATCCAGCAGTACTCGAAGACCTTCTTTGTCCTGCCGCTCCGCGGTAAGCT 

sfweagttkagypynyntta 

CTCTTTCTGGGAGGCAGGCACAACTAAAGCCGGGTACCCTTATAATTATAACACCACTGC 6840 

sdqllvenaaghrvaistyt 
tagcgaccaactgcttgtcgagaatgccgccgggcaccgggtcgctatttccacttacac 
tslgagpvsisavavlaphs 

CACTAGCCTGGGTGCTGGTCCCGTCTCCATTTCTGCGGTTGCCGTTTTAGCCCCCCACTC 6960 

alalledtldyparahtfdd 
tgcgctagcattgcttgaggataccttggactaccctgcccgcgcccatacttttgatga 
fcpecrplglqgcafqstva 

TTTCTGCCCAGAGTGCCGCCCCCTTGGCCTTCAGGGCTGCGCTTTCCAGTCTACTGTCGC 7080 

elqrlkmkvgktrelz 

TGAGCTTCAGCGCCTTAAGATGAAGGTGGGTAAAACTCGGGAGTTGTAGmATTTGCTT 







GTGCCCCCCTTCTTTC^GT-GC"iT-':'.:-T^rGCGTTCCGCGCTCCCTGA 7195 

Total number of bases in this sequence as
presented is 7195. The poly-A tail present in the
cloned sequence has been omitted.
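
A stated base count such as the one above can be checked mechanically. The following is a minimal sketch, assuming a listing laid out as whitespace-separated blocks of bases with an optional running total at the end of a row; the function name `parse_listing` and the toy listing are illustrative and are not part of the original disclosure.

```python
import re


def parse_listing(text):
    """Return (sequence, stated_total) from a plain-text sequence listing.

    Assumed layout: each row holds blocks of bases separated by spaces,
    optionally followed by a running base count as the last token.
    """
    bases = []
    stated_total = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        # A trailing all-digit token is treated as the running base count.
        if tokens[-1].isdigit():
            stated_total = int(tokens[-1])
            tokens = tokens[:-1]
        for token in tokens:
            # Anything other than A, C, G, T (for example a stray margin
            # number or an OCR artifact) is reported rather than guessed at.
            if not re.fullmatch(r"[ACGTacgt]+", token):
                raise ValueError(f"unexpected token in listing: {token!r}")
            bases.append(token.upper())
    return "".join(bases), stated_total


# Toy listing (hypothetical data, not taken from the disclosure).
listing = """
ACGTACGTAC GTACGTACGT 20
TTGACCA 27
"""
sequence, stated_total = parse_listing(listing)
print(len(sequence), stated_total)  # the two numbers should agree
```

Rejecting unexpected tokens instead of silently dropping them makes damaged rows visible, which matters when a listing has passed through scanning or retyping.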

The ability of the methods described herein to
isolate and identify genetic material from other NANB
hepatitis strains has been confirmed by identifying
genetic material from an isolate obtained in Mexico.
The sequence of this isolate was about 75% identical
to the ET1.1 sequence set forth in SEQ ID NO. 1 above.
The sequence was identified by hybridization using the
conditions set forth in Section II.B below.

In this different approach to isolation of the
virus, cDNA libraries were made directly from a
semi-purified human stool specimen collected from an
outbreak of ET-NANB in Telixtac. The recovery of
cDNA and the construction of representative libraries
were assured by the application of sequence-independent
single primer amplification (SISPA). A cDNA library
constructed in lambda gt11 from such an amplified cDNA
population was screened with a serum considered to
have "high" titer anti-HEV antibodies as assayed by
direct immunofluorescence on liver sections from
infected cynos. Two cDNA clones, denoted 406.3-2 and
406.4-2, were identified by this approach from a total
of 60,000 screened. The sequence of these clones was
subsequently localized to the 3' half of the viral
genome by homology comparison to the HEV (Burma)
sequence obtained from clones isolated by
hybridization screening of libraries with the
original ET1.1 clone.

These isolated cDNA epitopes, when used as
hybridization probes on Northern blots of RNA
extracted from infected cyno liver, gave a somewhat
different result when compared to the Northern blots
obtained with the ET1.1 probe. In addition to the
single 7.5 Kb transcript seen using ET1.1, two
additional transcripts of 3.7 and 2.0 Kb were
identified using either of these epitopes as
hybridization probes. These polyadenylated
transcripts were identified using the extreme 3' end
epitope clone (406.3-2) as probe and therefore
established these transcripts as co-terminal with the
3' end of the genome (see below). One of the epitope
clones (406.4-2) was subsequently shown to react in a
specific fashion with antisera collected from 5
different geographic epidemics (Somalia, Burma,
Mexico, Tashkent and Pakistan). The 406.3-2 clone
reacted with sera from 4 out of these same 5 epidemics
(Yarbough et al., 1990). Both clones reacted with
only post-inoculation antisera from infected cynos.
The latter experiment confirmed that seroconversion in
experimentally infected cynos was related to the
isolated exogenous cloned sequence.



A composite cDNA sequence (obtained from several
clones of the Mexican strain) is set forth below.
Composite Mexico strain sequence (SEQ ID NO. 10):





SEQ ID NO. 10:

GCCATGGAGG CCCACCAGTT CATTAAGGCT CCTGGCATCA CTACTGCTAT TGAGCAAGCA 60

GCTCTAGCAG CGGCCAACTC CGCCCTTGCG AATGCTGTGG TGGTCCGGCC TTTCCTTTCC 120

CATCAGCAGG TTGAGATCCT TATAAATCTC ATGCAACCTC GGCAGCTGGT GTTTCGTCCT 180

GAGGTTTTTT GGAATCACCC GATTCAACGT GTTATACATA ATGAGCTTGA GCAGTATTGC 240

CGTGCTCGCT CGGGTCGCTG CCTTGAGATT GGAGCCCACC CACGCTCCAT TAATGATAAT 300

CCTAATGTCC TCCATCGCTG CTTTCTCCAC CCCGTCGGCC GGGATGTTCA GCGCTGGTAC 360

ACAGCCCCGA CTAGGGGACC TGCGGCGAAC TGTCGCCGCT CGGCACTTCG TGGTCTGCCA 420

CCAGCCGACC GCACTTACTG TTTTGATGGC TTTGCCGGCT GCCGTTTTGC CGCCGAGACT 480

GGTGTGGCTC TCTATTCTCT CCATGACiTG CAGCCGGCTG ATGTTGCCGA GGCGATGGCT 540

CGCCACGGCA TGACCCGCC^ TTATGCAGCT ^TCCACTTGC CTCCAGAGGT GCTCCTGCCT 600

r^-^GGCACCT ACCGGACATC ATCC'ACTTG CTGATCCACG ATGGTAAGCG CGCGGHGTC 660

ACTTATGAGG GTGACACTAG CGCCGoTTAC AATCATGATG TTGCCACCCT CCGCACATGG 720
ATCAGGACAA C^AAou. ■ ~j ■ joo o'-'-' ^-^'^ i uGTGA TCGAGCGGGT GCGGGGTATT 780

GGCTGTCAC^ "G"'-" 3^::-C'GC3 GCCCC^GAGC CC^CCCCGAT GCCCTACGTT 840 

CCTTACCC3C GT'C3-Cjj- ~:c^^^ y ~ ^GG^C^A'CT "GGGCCCGG CGGGTCCCCG 900 

"CGc^G^^c: cG"C:3r': --—-^^ ^cgccgtccc cacgcacatc 960 

^GGG-ccr: ^i^v-:'---^: :':g-^.:gacc ^ggcc^tttg ctgctccagg 1020 

ct'a^jAlG' ^cz^'yr'z'z :^''-]:'-' — 3:^'-ac^g 'gggtgccct ggtcgctaat 1080 

GAAGGCTGGA ATGCI"CC3- "^^^^ ^ -y^:^^"^ "ACGGCGGC TTACCTCACA 1140 

ATATGTCA^C ^GCr^A''^ '^::y:^:y:'■\ :C3"^^^C^A :GGGCATGCG CCGGCTTGAG 1200 

CTTGAACATG C^CAG-^-^ ^^'Z^y rC'ACAGCT GGCTATTTGA GAAGTCAGGT 1260 

CGTGATTACA TCCCAGGCCG CCAGC^VC^S ">"ACGCTC AGTGCCGCCG CTGGTTATCT 1320 

GCCGGGTTCC ATCTCGACC: CCGCACC"A GTTT^TGATG AGTCAGTGCC TTGTAGCTGC 1380 

CGAACCACCA TCCGGCGG-^ CGCTGGa::: '^'^GC^GTT ^TATGAAGTG GCTCGGTCAG 1440 

GAGTGTTCTT GTTtcc'CCA GCCCGCCG-G GGGCTGGCGG GCGACCAAGG TCATGACAAT 1500 

GAGGCCTATG AAGGCTCTGA TGTT3ATACT 3CTGAGCCTG CCACCCTAGA CATTACAGGC 1560 

TCATACATCG TGGATGGTCG GTCTC'GC-A AC^GTCTATC AAGCTCTCGA CCTGCCAGCT 1620 

GACCTGGTAG CTCGCGCAGC CCGACTGTCT GC^ACAGTTA CTGTTACTGA AACCTCTGGC 1680 

CGTCTGGATT GCCAAACAAT GATCGGCAAT a;GACTTTTC TCACTACCTT TGTTGATGGG 1740 

GCACGCCTTG AGGTTAACGG GCCTGAGC^G C^TAACCTCT CTTTTGACAG CCAGCAGTGT 1800 

AGTATGGCAG CCGGCCCGT' ^TGCCTCACC 'ATGCTGCCG TAGATGGCGG GCTGGAAGTT 1860 

CATTTTTCCA CCGCTGGCC^ CGAGAGCCGT GTTGTTTTCC CCCCTGGTAA TGCCCCGACT 1920 

GCCCCGCCGA GTGAGGTCAC CGCCTTCTGC "CAGCTCTTT ATAGGCACAA CCGGCAGAGC 1980 

CAGCGCCAGT CGGTTATTGG TAGTTTGTGG CTGCACCCTG AAGGTTTGCT CGGCCTGTTC 2040 

CCGCCCTTTT CACCCGGGCA 'GAGTGGCGG 'CTGCTAACC CATTTTGCGG CGAGAGCACG 2100 

CTCTACACCC GCACTTGGTC CACAATTACA GACACACCCT TAACTGTCGG GCTAATTTCC 2160 

GGTCATTTGG ATGC^GCTCC CCAC^CGGGG GGGCCACCTG CTACTGCCAC AGGCCCTGCT 2220 

GTAGGCTCGT CTGAC'C^C: mGACCCG-C CCGCTACCTG ATGTTACAGA TGGCTCACGC 2280 

CCCTCTGGGG CCCGTCCGGr T.GCCCC-C CCGAATGGCG TTCCGCAGCG CCGCTTACTA 2340 

CACACCTACC CTGACGGCG''- 'AAGAT:':' G^CGGCTCCA T^TTCGAGTC TGAGTGCACC 2400 
TGGCTTGTC^ -CGCA^:'^:. CGCC-^^GCC^C CGCCCTGGTG GCGGGCTTTG TCATGCTTTT 2460

TTTCAGCGTT ACC'-GA": G^^^G^CGCC ACCAAGTTTG TGATGCGTGA TGGTCTTGCC 2520 

GCGTATACCC ^^ACACCCCG GGCGATCAt- CATGCGGTGG CCCCGGACTA TCGATTGGAA 2580 

CATAACCCCA agagg:^:GA GGrGCCTAC CGCGAGACTT GCGCCCGCCG AGGCACTGCT 2640 

gcctatccac ':'^a]3:g: 'ggg-'^^^a: caggtgcctg ttagtttgag ttttgatgcc 2700 

TGGGAolGGA -^'^ 1! AG „ J ^ ^ ^ j.t.jACGAG GTTTACCiAA CAGAGCTGGC GGCTCGGTGG 2760 

TTTGAATCGA acGGCGGGGG TCAGCCGAGG TTGAACATAA CTGAGGATAC CGCCCGTGCG 2820 

GCCAACCTGG GCG^GGAGG' 'GAG^CCGGG agTGAAGTAG GCCGCGCATG TGCCGGGTGT 2880 

aaagtcgagc c^ggcg^^g' gcggtatcag ^-^tacagccg GTGTCCCCGG CTCTGGCAAG 2940 

TCAAAGTCCG ^GCAACAGGC GG-tg"GGAT gttgttgttg tgcccactcg CGAGCTTCGG 3000 

AACGCTTGGC GGCGCCGGGG C^TTGCGGCA TTCACTCCGC ACACTGCGGC CCGTGTCACT 3060 

AGCGGCCGTA GGG"GTCAT TGATGAGGCC CCTTCGCTCC CCCCACACTT GCTGCTTTTA 3120 

CATATGCAGC GTGCTGCATC TGTGCACCTC CTTGGGGACC CGAATCAGAT CCCCGCCATA 3180 

GATTTTGAGC ACACCGGTCT GATTCCAGCA ATACGGCCGG AGTTGGTCCC GACTTCATGG 3240 

TGGCATGTCA CCCACCGTTG CCCTGCAGAT GTCTGTGAGT TAGTCCGTGG TGCTTACCCT 3300 

AAAATCCAGA CTACAAGTAA GGTGCTCCGT TCCCTTTTCT GGGGAGAGCC AGCTGTCGGC 3360 

CAGAAGCTAG TGTTCACACA GGCTGCTAAG GCCGCGCACC CCGGATCTAT AACGGTCCAT 3420 

GAGGCCCAGG GTGCCACTTT TACCACTACA ACTATAATTG CAACTGCAGA TGCCCGTGGC 3480 

CTCATACAGT CCTCCCGGGC TCACGCTATA GTTGCTCTCA CTAGGCATAC TGAAAAATGT 3540 

GTTATACTTG ACTCTCCCGG CCTGTTGCGT GAGGTGGGTA TCTCAGATGC CATTGHAAT 3600 

AATTTCTTCC TTTCGGGTGG CGAGGTTGGT CACCAGAGAC CATCGGTCAT TCCGCGAGGC 3660 

AACCCTGACC GCAATGTTGA CGTGCTTGCG GCGTTTCCAC CTTCATGCCA AATAAGCGCC 3720 

TTCCATCAGC TTGCTGAGGA GCTGGGCCAC CGGCCGGCGC CGGTGGCGGC TGTGCTACCT 3780 

CCCTGCCCTG AGCTTGAGCA GGGCCTTCTC TATCTGCCAC AGGAGCTAGC CTCCTGTGAC 3840

AGTGTTGTGA CATTTGAGCT AACTGACATT GTGCACTGCC GCATGGCGGC CCCTAGCCAA 3900 

AGGAAAGCTG TTTTGTCCAC GCTGGTAGGC CGGTATGGCA GACGCACAAG GCTTTATGAT 3960

GCGGGTCACA CCGATGTCCG CGCCTCCCTT GCGCGCTTTA TTCCCACTCT CGGGCGGGTT 4020

ACTGCCACCA CCTGTGAACT CTTTGAGCTT GTAGAGGCGA TGGTGGAGAA GGGCCAAGAC 4080







GGTTCAGCC3 "3::::;::A'G tccccgcat aacctttttc 414o 

CAGAAGGATT GTAAC^^G'^ CACGATCGI: ^agaCAA^tG CGCATGGCAA AGTCGGTCAG 4200 

5 GGTA^" GC^GGAG'AA GACG^' — G^ GCCrG^"G GCCCCTGGTT CCGTGCGATT 4260 

GAGAAGG^a ^r^'r':' ACGGGGATGC TTATGACGAC 4320 

TCAGTATTCT CTGCTGCi:' GG:'33:3:: a3::;-3::: ^GGTGTTTGA AAATGATTTT 4380 


TCTGAGTTTG AC^CGAC'CA GAA'^^r ::rAr;3-c ^TGAGTGCGC CATTATGGAA 4440 

GAGTG^GG^A 'GCCCCAG'G Gr'G":A::3 "G^a^catg CCGTCCGGTC GGCGTGGATC 4500 

15 CTGCAGGCCC CAAAagag^C "^VAG-^vr. '^C^GGAAGA AGCATTCTGG TGAGCCGGGC 4560 

AGCTTGCTCT GGAATACGGT G'GGAA-'3 GCAATCATTG CCCATTGCTA TGAGTTCCGG 4620 

GACCTCCAGG TTGCCGCCTT CAAGGGCGAC GACTCGGTCG TCCTCTGTAG TGAATACCGC 4680 


CAGAGCCCAG GCGCCGGTTC GCTTATAGCA GGCTGTGGTT TGAAGTTGAA GGCTGACTTC 4740 

CGGCCGATTG GGCTGTATGC CGGGGT^GTC GTCGCCCCGG GGCTCGGGGC CCTACCCGAT 4800 

GTCGTTCGAT TCGCCGGACG GCTTTCGGAG AAGAAC'GGG GGCCTGATCC GGAGCGGGCA 4860

GAGCAGCTCC GCCTCGCCGT GCAGGATTTC CTCCGTAGGT TAACGAATGT GGCCCAGATT 4920

TGTGTTGAGG TGGTGTCTAG AGTTTACGGG GTTTCCCCGG GTCTGGTTCA TAACCTGATA 4980


GGCATGCTCC AGACTATTGG TGATGGTAAG GCGCAT'TTA CAGAGTCTGT TAAGCCTATA 5040 

CTTGACCTTA CACACTCAAT TATGCACCGG TCTGAATGAA TAACATGTGG TTTGCTGCGC 5100 

35 CCATGGGTTC GCCACCATGC GCCCTAGGCC "^CTT^'GCTG TTGTTCCTCT TGTTTCTGCC 5160 

TATGTTGCCC GCGCCACCGA CCGGTCAGCC GTCTGGCCGC CGTCGTGGGC GGCGCAGCGG 5220 

CGGTACCGGC GGTGGTTTCT GGGGTGACCG GGTTGATTCT CAGCCCTTCG CAATCCCCTA 5280 


TATTCATCCA ACCAACCCCT TTGCCCCAGA CGTTGCCGCT GCGTCCGGGT CTGGACCTCG 5340 

CCTTCGCCAA CCAGCCCGGC CACTTGGCTC CACTTGGCGm uATCAGGCCC AGCGCCCCTC 5400 

CGCTGCCTCC CGTCGCCGAC CTGCCACAGC CGGGGCTGCG GCGCTGACGG CTGTGGCGCC 5460

TGCCCATGAC ACCTCACCCG TCCCGGACGT 'GATTCTCGC GGTGCAATTC TACGCCGCCA 5520 

GTATAATTTG TCTACTTCAC CCC^GACATC CTCTGTGGCC TCTGGCACTA ATTTAGTCCT 5580 


GTATGCAGCC CCCCTTAATC CGCCTC^GCC GCTGCAGGAC GGTACTAATA CTCACATTAT 5640 

GGCCACAGAG GCCTCCAAT^ a^gCACaG^A CCGGGTTGCC CGCGCTACTA TCCGTTACCG 5700 

55 GCCCCTAGTG CCTAATGCAG "'GGAGGC'a ^GCTATATCC ATTTCTTTCT GGCCTCAAAC 5760 
AACCACAACC CC^;:;:C^3 '-3a::'G.^m -TCCAT-ac^ TCCACTGATG TCAGGATTCT 5820

TGTTCAACCT GGCATAGCAT CTGAAT^GGT CATCCCAAGC GAGCGCCTTC ACTACCGCAA 5880 

TCAAGGT'GG CGC'CGG'^G AGACAT:"GG TGT'^GCTGAG GAGGAAGCCA CCTCCGGTCT 5940 

TGTC^TG"A -^cL-r-^ GC'^^:::G^ TAACTCC^at aCCAATACCC CTTATACCGG 6000 

TGCCC"GGC -G"- -.r-^A^A GC^^GAGTT' CGCAATCTCA CCACCTGTAA 6060 

CACCAA-ACA CG^G^G^::: G"ac-:ag CACGC^CG^ CACTCCGCCC GAGGGGCCGA 6120 

CGGGAC'GCG GAGr:A::A iaa:-;-.-,.: CACCAGG": ATGAAAGATC TCCACTTTAC 6180 

^5 CGGCCTTAAT GGGG^AGG'G AAGTC^GCCG CGGGATAGC'^ CTAACATTAC TTAACCTTGC 6240 

TGACACGCTC C^CGGCG^GC tccCGaca;- a^taatttcG TCGGCTGGCG GGCAACTGTT 6300 

TTATTCCCGC CCGGTTGTCT CAGCCAATGG CGAGCCAACC GTGAAGCTCT ATACATCAGT 6360 


GGAGAATGCT CAGCAGGATA AGGGTGTTGC TATCCCCCAC GATATCGATC TTGGTGATTC 6420 

GCGTGTGGTC A--CAGGA-T atg:,ca:c:a GCATGAGCAG GATCGGCCCA CCCCGTCGCC 6480 

25 TGCGCCATCT CGGCCT-tt CTGTTC'CCG AGCAAATGAT GTACTTTGGC TGTCCCTCAC 6540 

TGCAGCCGAG TATGACCAGT CCACTTACGG GTCGTCAACT GGCCCGGTTT ATATCTCGGA 6600 

CAGCGTGACT TTGGTGAATG TTGCGACTGG CGCGCAGGCC GTAGCCCGAT CGCTTGACTG 6660 


GTCCAAAGTC ACCCTCGACG GGCGGCCCCT CCCGACTGTT GAGCAATATT CCAAGACATT 6720 

CTTTGTGCTC CCCCTTCGTG GCAAGCTCTC CTTTTGGGAG GCCGGCACAA CAAAAGCAGG 6780 

TTATCCTTAT AATTATAATA CTACTGCTAG TGACCAGATT CTGATTGAAA ATGCTGCCGG 6840

CCATCGGGTC GCCATTTCAA CCTATACCAC CAGGC , TGGG GCCGGTCCGG TCGCCATTTC 6900 

TGCGGCCGCG GTTTTGGCTC CACGCTCCGC CCTGGCTCTG CTGGAGGATA CTTnGATTA 6960 

TCCGGGGCGG GCGCACACAT TTGATGACTT CTGCCCTGAA TGCCGCGCTT TAGGCCTCCA 7020

GG6TTGTGCT TTCCAGTCAA CTGTCG:'GA GCTCCAGCGC CTTAAAGTTA AGGTGGGTAA 7080 

AACTCGGGAG TTGTAGTTTA TTTGGCTGTG CCCACCTACT TATATCTGCT GATTTCCTTT 7140 

ATTTCCTTTT TCTCGGTCCC GCGCTCCCTG A 7171 



The above sequence was obtained from 
polyadenylated clones. For clarity the 3' polyA 
"tail" has been omitted. 
The sequence above includes a partial cDNA
sequence consisting of [illegible] nucleotides that was
identified in a previous application in this series.
The previously identified partial sequence is set
forth below, with certain corrections (SEQ ID NO. 11).

The corrections include deletion of the first 80 bases
of the prior reported sequence, which are cloning
artifacts; insertion of [illegible] after former position 174,
of C after 270, and of [illegible]GCG after 279; change of C to
T at former position 70[illegible], of GC to CG at 722-723, of
CC to TT at 1238-39, and of C to G at 1606; deletion
of T at former position [illegible]65; and deletion of the last
11 bases of the former sequence, which are part of a
linker sequence and are not of viral origin.
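
Corrections of this kind are all stated in the coordinates of the former sequence, so they are most safely applied from the highest position downward. The sketch below illustrates that bookkeeping on a toy sequence; the function name `apply_corrections`, the edit-tuple format, and the example data are assumptions made for illustration and do not reproduce the actual corrections listed above.

```python
def apply_corrections(seq, edits):
    """Apply non-overlapping edits given in 1-based ORIGINAL coordinates.

    Each edit is one of (hypothetical format):
      ("del", start, end)   delete original positions start..end inclusive
      ("ins", pos, bases)   insert bases after original position pos
      ("sub", start, bases) overwrite original positions from start onward
    """
    out = seq
    # Working right to left keeps every lower coordinate valid, because an
    # edit never shifts the bases that lie to its left.
    for edit in sorted(edits, key=lambda e: e[1], reverse=True):
        kind = edit[0]
        if kind == "del":
            _, start, end = edit
            out = out[:start - 1] + out[end:]
        elif kind == "ins":
            _, pos, bases = edit
            out = out[:pos] + bases + out[pos:]
        elif kind == "sub":
            _, start, bases = edit
            out = out[:start - 1] + bases + out[start - 1 + len(bases):]
        else:
            raise ValueError(f"unknown edit type: {kind!r}")
    return out


# Toy example (hypothetical data): trim 4 leading and 3 trailing bases,
# insert T after position 8, and change the G at position 10 to A.
former = "AAAACCCCGGGGTTTT"
edits = [
    ("del", 1, 4),
    ("ins", 8, "T"),
    ("sub", 10, "A"),
    ("del", 14, 16),
]
print(apply_corrections(former, edits))  # prints CCCCTGAGGT
```

Applying the same edits left to right would require re-deriving each position after every insertion or deletion, which is where off-by-one errors tend to creep in.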

Non-A Non-B: Mexican Strain; SEQ ID NO. 11

SEQ ID NO. 11:

GTTGCGTGAG GTGGG^ATC^ CAGA^SC:-^' 'gt'^-^at TTCTTCCTTT CGGGTGGCGA 60 

20 GGTTGGTCAC CAGAGACCAT CGGTC;<"C: rCG^GGCAAC CCTGACCGCA ATGTTGACGT 120 

GCTTGCGGCG TTTCCACCTT CATGCCAAAT AAGCGCCTTC CATCAGCTTG CTGAGGAGCT 180 

GGGCCACCGG CCGGCGCCGG TGGCGGCTGT GCTACCTCCC TGCCCTGAGC TTGAGCAGGG 240 


CCTTCTCTAT CTGCCACAGG AGCTAGCCTC CTG'GACAGT GTTGTGACAT TTGAGCTAAC 300 

TGACATTGTG CACTGCCGCA TGGCGGCCCC TAGCCAAAGG AAAGCTGTTT TGTCCACGCT 360 

GGTAGGCCGG TATGGCAGAC GCACAAGGCT TTATGATGCG GGTCACACCG ATGTCCGCGC 420

CTCCCTTGCG CGCTTTATTC CCACTC'CGG GCGGGTTACT GCCACCACCT GTGAACTCTT 480 

TGAGCTTGTA GAGGCGATGG TGGAGAAggg CCAAGACGGT TCAGCCGTCC TCGAGTTGGA 540 


TTTGTGCAGC CGAGATGTCT CCCGCAt;ac CTTTTTCCAG AAGGATTGTA ACAAGTTCAC 600 

GACCGGCGAG ACAATTGCGC a-gg:aaag' CGGTCAGGGT ATCTTCCGCT GGAGTAAGAC 660 

^0 cttttgtgcc ctgtttggcc :c^gg":cg 'gcgat^gag aaggctattc tatccctttt 720 

ACCACAAGCT GTGTTCTACG GGGATG:*'A 'GACGAC^CA GTATTCTCTG CTGCCGTGGC 780 

TGGCGCCAGC CATGCCATGG 'G'"::--AA ^GATT^^-CT GAGTTTGACT CGACTCAGAA 840 


taacttttcc ctagg^c^tg ag^gcg::a* -tggaagag 'gtggtatgc cccagtggct 900 

TGTCAGGTTG TACCATGCCG ^CCGG'IGG: ]TGGATCCTG CAGGCCCCAA AAGAGTCTTT 960 
GAGAGGGTTC ^GG-^-".— ^T-C^r.^^.- G^GSGC^CG T^GCTCTGGA ATACGGTGTG 1020

GAACATGGCA A':;"3::: A"3-:\-~,a oT-CCGGGAC C^CCAGGTTG CCGCCTTCAA 1080 

GGGCGACGAC 'CGG'CG^:: ^:^G'A3^GA at::c3::aG AGCCCAGGCG CCGGTTCGCT 1140 

ta^agcagg: ^3'::"':a :v'3a:.:g: ^3a:''::g3 ccga^'gggc tgtatgccgg 1200 

ggt^gtc3': gc:::;:.j.: "':3333 :::' ac::3a'3:: g^^cgattcg ccggacggct 126O 

TTCGGA3AAG a::-;333 : :-3A-:33a 3C.33.:^GA3 CAGC^CCGCC TCGCCGTGCA 1320 

GGAT^^CrC :G'A33^':a :^a:-3-33: C:a3A"'3^ GT^GAGGTGG TGTCTAGAGT 1380 

15 TTACGGGG'T :CC::333': ^GG'^CAtaa :C^3A"AGGC ATGCTCCAGA CTATTGGTGA 1440 

TGGTAAGGCG CATT^^acag AGTCTGTTAA GCCTATACTT GACCTTACAC ACTCAATTAT 1500 

gcaccggtc^ gaatgaa-aa catgtggttt GCTGCGCCCA TGGGTTCGCC ACCATGCGCC 1560 

CTAGGCCTCT TTTGC 1575 






When comparing the Burmese and Mexican 
strains, 75.7% identity is seen in a 7189 nucleotide 
overlap beginning at nucleotide 1 of the Mexican 
strain and nucleotide 25 of the Burmese strain. 
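
A percent-identity figure of this kind is the fraction of aligned positions at which the two strains carry the same base. The sketch below shows the calculation on two short, already-aligned fragments from the start of the Burma and Mexico sequences; the function name `percent_identity` is illustrative, and counting every aligned column in the denominator is an assumption, since the scoring convention used for the 75.7% figure is not stated.

```python
def percent_identity(a, b):
    """Percent of aligned columns in which two equal-length strings agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


# 32-base fragments from the opening of the Burma/Mexico comparison.
burma = "GGAGGCCCATCAGTTTATTAAGGCTCCTGGCA"
mexico = "GGAGGCCCACCAGTTCATTAAGGCTCCTGGCA"
print(percent_identity(burma, mexico))  # 30 of 32 positions match: 93.75
```

A short conserved stretch like this one scores well above the genome-wide figure, which averages over more divergent regions as well.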

In the same manner, a different strain of
HEV was identified in an isolate obtained in Tashkent,
U.S.S.R. The Tashkent sequence is given below (SEQ ID
NO. 12):

SEQ ID NO. 12:

CGGGCCCCGT ACAGGTCACA ACCTGTGAGT TGTACGAGCT AGTGGAGGCC ATGGTCGAGA 60 
AAGGCCAGGA TGGCTCCGCC GTCCTTGAGC TCGATCTCTG CAACCGTGAC GTGTCCAGGA 120
TCACCTTTTT CCAGAAAGAT TGCAATAAGT TCACCACGGG AGAGACCATC GCCCATGGTA 180 
AAGTGGGCCA GGGCATTTCG GCCTGGAGTA AGACCTTCTG TGCCCTTTTC GGCCCCTGGT 240

TCCGTGCTAT TGAGAAGGCT ATTCTGGCCC TGCTCCCTCA GGGTGTGTTT TATGGGGATG 300 
CCTTTGATGA CACCGTCTTC TCGGCGCGTG TGGCCGCAGC AAAGGCGTCC ATGGTGTTTG 360 


AGAATGACTT TTCTGAGTTT GACTCCACCC AGAATAATTT TTCCCTGGGC CTAGAGTGTG 420
CTATTATGGA GAAGTGTGGG ATGCCGAAGT GGCTCATCCG CTTGTACCAC CTTATAAGGT 480 
CTGCGTGGAT CCTGCAGGCC CCGAAGGAGT CCCTGCGAGG GTGTTGGAAG AAACACTCCG 540

GTGAGCCCGG CACTCTTC- tggaatacTG ^CTGGAACAT GGCCGTTATC ACCCATTGTT 600 
ACG;^""^CCG CGA"^GCAG G^GGC'G:CT T^A^^AGGTGA TGATTCGATA GTGCTTTGCA 660

GTGAG^ACCG "^CAGAG^C:- GGGGC^GC'G TCCTG-^TGC "GGCTGTGGC TTAAAGCTGA 720 


AGGTGGGTTT CCGTCCGA'T GGTTTGTA'^G CAGGTGTTGT GGTGACCCCC GGCCTTGGCG 780 

CV:'^.::CGA rGTCGTGCGC TTGTCCGGC: GGCTTAC'GA GAAGAATTGG GGCCCTGGCC 840 

10 C^GAGGGGGC GGAGCAGC^C CGCCTTGCTG TGCG 874 

As shown in the following comparison of
sequences, the Tashkent (Tash.) sequence more closely
resembles the Burma sequence than the Mexico sequence,
as would be expected of two strains from more closely
related geographical areas. The numbering system used
in the comparison is based on the Burma sequence. As
indicated previously, Burma has SEQ ID NO: 6; Mexico,
SEQ ID NO: 10; and Tashkent, SEQ ID NO: 12. The
letters present in the lines between the sequences
indicate conserved nucleotides.
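
The middle line of each comparison block can be derived directly from the two aligned sequences: a base is printed where both strains agree and a blank where they differ. The following is a minimal sketch of that rule; the function name `conserved_line` is illustrative, and the two 32-base fragments are taken from the opening of the Burma and Mexico sequences.

```python
def conserved_line(a, b):
    """Build the line shown between two aligned sequences.

    Positions where both sequences carry the same base keep that base;
    positions that differ are shown as a single space.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(x if x == y else " " for x, y in zip(a.upper(), b.upper()))


burma = "GGAGGCCCATCAGTTTATTAAGGCTCCTGGCA"
mexico = "GGAGGCCCACCAGTTCATTAAGGCTCCTGGCA"

print("-BURMA  " + burma)
print("        " + conserved_line(burma, mexico))
print("-MEXICO " + mexico)
```

Printed in a fixed-width font, the blanks line up under the two positions at which the fragments differ (T in Burma, C in Mexico).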

10v 20v 30v 40v 50v 60v

-BURMA AGGCAGACCACATATGTGGTCGATGCCATGGAGGCCCATCAGTTTATTAAGGCTCCTGGCA
GCCATGGAGGCCCA CAGTT ATTAAGGCTCCTGGCA

-MEXICO GCCATGGAGGCCCACCAGTTCATTAAGGCTCCTGGCA

70v 80v 90v 100v 110v 120v

-BURMA TCACTACTGCTATTGAGCAGGCTGCTCTAGCAGCGGCCAACTCTGCCCTGGCGAATGCTG
TCACTACTGCTATTGAGCA GC GCTCTAGCAGCGGCCAACTC GCCCT GCGAATGCTG

-MEXICO TCACTACTGCTATTGAGCAAGCAGCTCTAGCAGCGGCCAACTCCGCCCTTGCGAATGCTG

130v 140v 150v 160v 170v 180v

-BURMA TGGTAGTTAGGCCTTTTCTCTCTCACCAGCAGATTGAGATCCTCATTAACCTAATGCAAC
TGGT GT GGCCTTT CT TC CA CAGCAG TTGAGATCCT AT AA CT ATGCAAC

-MEXICO TGGTGGTCCGGCCTTTCCTTTCCCATCAGCAGGTTGAGATCCTTATAAATCTCATGCAAC

190v 200v 210v 220v 230v 240v

-BURMA CTCGCCAGCTTGTTTTCCGCCCCGAGGTTTTCTGGAATCATCCCATCCAGCGTGTCATCC
CTCG CAGCT GT TT CG CC GAGGTTTT TGGAATCA CC AT CA CGTGT AT C

-MEXICO CTCGGCAGCTGGTGTTTCGTCCTGAGGTTTTTTGGAATCACCCGATTCAACGTGTTATAC

250v 260v 270v 280v 290v 300v

-BURMA ATAACGAGCTGGAGCTTTACTGCCGCGCCCGCTCCGGCCGCTGTCTTGAAATTGGCGCCC
ATAA GAGCT GAGC TA TGCCG GC CGCTC GG CGCTG CTTGA ATTGG GCCC

-MEXICO ATAATGAGCTTGAGCAGTATTGCCGTGCTCGCTCGGGTCGCTGCCTTGAGATTGGAGCCC

310v 320v 330v 340v 350v 360v

-BURMA ATCCCCGCTCAATAAATGATAATCCTAATGTGGTCCACCGCTGCTTCCTCCGCCCTGTTG
A CC CGCTC AT AATGATAATCCTAATGT TCCA CGCTGCTT CTCC CCC GT G

-MEXICO ACCCACGCTCCATTAATGATAATCCTAATGTCCTCCATCGCTGCTTTCTCCACCCCGTCG
370v 380v 390v 400v 410v 420v

-BURMA GGCS-'.^^r^^-^T:: - 'V':'^::AC-GCGGGCCGGCTGCTAATTGCCGGC 
G CG — - - .CT G GG CC GC GC AA TG CG C 

-MEXICO GCCGGG"'":":---^^^'.'"-^-'- " " ' ■ ^.^^"C--^:^.-GGGGACCTGCGGCGAACTGTCGCC 

430v 440v 450v 460v 470v 480v

-Bu^^^^- ;;"-vr: . r-:]'-^^ /: :^^cttactgcctcgacgggttttctg 
3 ^: 3: r :^ v-:cgcacttactg t ga gg tit c g 

-MEXICO gc"C3G^-:":g' ^: "-'::''^"cgcacttactgttttgatggctttgccg 

490v 500v 510v 520v 530v 540v

-BUkMA GC'G^A:.r"*:ccGCC::" ^-■'"'r,:A'CG:CC:CTACTCCCTTCArGATATGTCACCAT 

GC^G CCSCCG-r-^-GG ' CTCTA TC CT CAT6A TG CC 

-MEXICO GC"GCCG"^^GCCGCCG-GA:TGGTGTGGC^CTCTATTCTCTCCATGACnGCA6CCGG 

550v 560v 570v 580v 590v 600v

-BURMA CTGATGTCGCCGAGGC'V^^G — CCGC'hTGGtaTGACGCGGCTCTATGCCGCCCTCCATC 

CTGATGT GCCGAGGC -TG CGCC^ GG ATGAC CG CT TATGC GC TCCA 
-MEXICO CTGATG^TGCCG-^GG^G^'r-^f'^rGCCACGGCATGACCCGCCTTTATGCAGCTTTCCACT 

610v 620v 630v 640v 650v 660v

-BURMA TTCCGCC^GAGGTrcT-^r^vCCCCC'GGC^CATATCGCACCGCATCGTATTTGCTAATTC 
T CC CC GAGG- CT C-GC'^ CCT.GC^C TA CG AC CATC TA TTGCT AT C 
-MEXICO "GCCTCCAG-GG-G:'CC*G'"C':r^"iGC-CCTACCGGACATCATCCTACTTGCTGATCC 

670v 680v 690v 700v 710v 720v

-BURMA ATGACGGTAGGCGCGTTG'GG-rGACGTATGAGGGTGATACTAGTGCTGGTTACAACCACG 
A GA GG^A GCGCG GT G^ AC TA^GAGGGTGA ACTAG GC GGTTACAA CA G 
-MEXICO ACGATGGTAAGCGCGCGG"VCAr-TATGiGGGTGACACTAGCGCCGGTTACAATCATG 

730v 740v 750v 760v 770v 780v

-BURMA ATGTCTCCAACTTGCGC'^CCTGGATTAGAACCACC'iAGGTTACCGGAGACCATCCCCTCG 
ATGT CCA C T CGC : TGG^^ -^G AC AC AAGGTT GG GA CA CC T G 
-MEXICO ATGTTGCCACCC'CCGCACA'GGATCAGGACAACTAAGGTTGTGGGTGAACACCCTTTGG 

790v 800v 810v 820v 830v 840v

-BURMA TTATCGAGCGGGTTAGGGCCATTGGCTGCCACTTTGTTCTCTTGCTCACGGCAGCCCCGG 
T ATCGAGCGGGT GGG ATTGGCTG CACTTTGT T TTG TCAC GC GCCCC G 
-MEXICO TGATCGAGCGGGTGCGGGGTATTGGCTGTCACTTTGTGTTGTTGATCACTGCGGCCCCT6 

850v 860v 870v 880v 890v 900v

-BURMA AGCCATCACCTATGCCnATG^-CTTACCCCCGGTCTACCGAGGTCTATGTCCGATCGA 
AGCC TC CC ATGCC GiTCCTTACCC CG TC AC GAGGTCTATGTCCG TC A 
-MEXICO AGCCCTCCCCGflTGCCCTACGTTCCTTACCCGCGTTCGACGGAGGTCTATGTCCGGTCTA 

910v 920v 930v 940v 950v 960v

-BURMA TCTTCGGCCCGGG^GGCAflCC^-rC^TATTCCCAACCTCATGCTCCACTAAGTCGACCT 
TCTT GG CC GG GG ^CC: 'C ^ ^^CCC ACC C TG C AAGTC AC T 
-MEXICO "CTT-GGG:c:GGCGGG': "..:-G"CCCGACCGCTTGTGCTGTCAAGTCCACTT 

970v 980v 990v 1000v 1010v 1020v

-BURMA TCCATGr-rc^:-GC:'^'^'-":--":CCGTC:TATGCTGTTCGGGGCCACCnGGATG 
T CA GC V^r ' ' ■ -G:rrGTC^ ATGCT IT GGGGCCACC T GA G 

-MEXICO ""CACGCCG'T'^: *• :g:' "AT'GGGACCG^CTCATGCTCTTTGGGGCCACCCTCGACG 






1030v 1040v 1050v 1060v 1070v 1080v

-BURMA ACC-^GC:""3:".C': :r:"^'^-TjACC'ACC:TCGCGGCATTAGCTACAAGGTCA 

Acc^ .3 ^ ;^3-. -cc^^CG ggcattagcta aaggt a 

5 -MEXICO ACCAoV^: r.^-'::^ ::^^:3r--G:crAC:TTCGTGGCATTAGCTATAAGGTAA 

1090v 1100v 1110v 1120v 1130v 1140v

-BLR^^ C'^'" :r^:::"3-V'^A^:^: .::-'y^^-T;,:c^CTGAGGACGCCCTCACAGCTG 

:^G^ ^^3^ r 3^'-^':.A"Ggc^gG'^atgcc c gagga gc ctcac gc g 

10 -MEXICO CTG-3rrG::r:,G'.:::^^A-G::GGC^rGAATGCCACCGAGGATGCGCTCACTGCAG 

1150v 1160v 1170v 1180v 1190v 1200v

-BURMA iTA7CAC'j::3::'^.::";CCAT"GCCACC;GCGGTATCTCCGCACCCAGGCTATAT 

"TAT AC g: gc ac ^g ca cagcg tat t cg acccaggc at t 
15 -MEXICO ^tat-:cggcgg:"ac:t:: - -:^gtcatcagcgttatttgcggacccaggcgattt 

1210v 1220v 1230v 1240v 1250v 1260v

-BURMA ccaagggg^^gcgtcgtc^gg-acgggagcatgcccagaagtttataacacgcctctaca 
c aaggg atgcg cg ct ga c ga catgc cagaa that cacgcctctaca 
20 -MEXICO ctaagggcatgcgccggc^tgagcttgaacatgctcagaaatttatttcacgcctctaca 

1270v 1280v 1290v 1300v 1310v 1320v

-BURMA gc-^ggc'C'tcgagaagtccggccgtgattacatccctggccgtcagttggagttctacg 
gctggct n gagaagtc gg cgtgattacatccc ggccg cag tg agttctacg 
25 -MEXICO gctggctatttgagaagtcaggtcgtgattacatcccaggccgccagctgcagttctacg 

1330v 1340v 1350v 1360v 1370v 1380v

-BURMA cccagtgcaggcgctggctctccgccggctttcatcttgatccacgggtgttggtttttg 
c cagtgc g cgctgg t to gccgg tt catct ga gc cg tt gtttttg 

30 -MEXICO CTCAGTGCCGCCGCTGGT-ATCTGCCGGGTTCCATCTCGACCCCCGCACCTTAGTnrrG 

1390v 1400v 1410v 1420v 1430v 1440v
-BURMA ACGAGTCGGCCCCCTGCCATTGTAGGACCGCGATCCGTAAGGCGCTCTCAAAGTTTTGCT 
A GAGTC G CC TG TG G ACC C ATCCG G AAA TTTTGCT 

35 -MEXICO ATGAGTCAGTGCCTTGTAGCTGCCGAACCACCATCCGGCGGATCGCTGGAAAATTnGCT 

1450v 1460v 1470v 1480v 1490v 1500v 
-BURMA GCTTCATGAAGTGGCTTGGTCAGGAGTGCACCTGCTTCCTTCAGCCTGCAGAAGGCGCCG 
G TT ATGAAGTGGCT GGTCAGGAGTG C TG TTCCT CAGCC GC GA 6G G 
^0 -MEXICO GTTTTATGAAGTGGCTCGGTCAGGAGTGTTCTTGTTTCCTCCAGCCCGCCGAGGGGCTGG 

1510v 1520v 1530v 1540v 1550v 1560v 
-BURMA TCGGCGACCAGGGTCATGATAATGAAGCCTATGAGGGGTCCGATGTTGACCCTGCT6AGT 
GGCGACCA GGTCATGA AATGA GCCTATGA GG TC GATGTTGA CTGCTGAG 
^5 -MEXICO CGGGCGACCAAGGTCATGACAATGAGGCCTATGAAGGCTCTGATGTTGATACTGCTGAGC 

1570v 1580v 1590v 1600v 1610v 1620v
-BURMA CCGCCATTAGTGACATATCTGGGTCCTATGTCGTCCCTGGCACTGCCCTCCAACCGCTCT 
C GCCA GACAT : GG ^C "A TCGT TGG C CT CAA C TCT 
50 -MEXICO CTGCCACCCTAGACATT^CAGGCTCATACATCGTGGATGGTCGGTCTCTGCAAACTGTCT 

1630v 1640v 1650v 1660v 1670v 1680v
-BURMA ACCAGGCCCTCG^'CTCC CGC^GAGATTGTGGCTCGCGCGGGCCGGCTGACCGCCACAG 
55 A CA GC CTCGA CT CC GCTGA T GT GCTCGCGC G CCG CTG C GC ACAG 

-MEXICO ATCAAGCTCTCGACCTGCCAGCTGACCTGGTAGCTCGCGCAGCCCGACTGTCT6CTACAG 
1690v 1700v 1710v 1720v 1730v 1740v

-BURMA taaaggtc'cc:;gg':gatgggcgga-cgattgcgagacccttcttggtaacaaaacct 

T A GT : A C -GG CG ^ GATTGC A AC T T GG AA AA AC T 
5 -MEXICO TTAC^G^ACGAAACC'C^GGCCGTC^GGATTGCCAAACAATGATCGGCAATAAGACTT 

1750v 1760v 1770v 1780v 1790v 1800v
-BURMA TTCGCACG:lo^'C3''G-^:GGoGCGGTCTtaGAGACCAATGGCCCAGAGCGCCACAATC 
TTC CAC C G^^GA GGGGC C T GAG AA GG CC GAGC C AA C 
10 -MEXICO TTC^CAC^ACCrTTr,--GATGGGGCACGCCTTGAGGTTAACGGGCCTGAGCAGCTTAACC 

1810v 1820v 1830v 1840v 1850v 1860v

-BURMA TCTCC^^CGA^GCCAGM^-.AGCACTATGGCCGCTGGCCCTTTCAGTCTCACCTATGCCG 

TC'C :t GA C CAo G a TATGGC gc ggccc tt g ctcacctatgc g 
15 -MEXICO tctcttt^gacagccag:agtgtagtatggcagccggcccgttttgcctcacctatgctg 

1870v 1880v 1890v 1900v 1910v 1920v

-BURMA cctctgcagctgggctggaggtgcgctatgttgctgccgggcttgaccatcgggcggttt 

CC G G GGGCTGGA GT C T T C gc GG CT GA CG G GTTT 
20 -MEXICO CCGTAGATGGCGGGCTGGAAGTTCATTTTTCCACCGCTGGCCTCGAGAGCCGTGTTGTTT 

1930v 1940v 1950v 1960v 1970v 1980v 
-BURMA TTGCCCCCGGTGTTTCACCCCGGTCAGCCCCCGGCGAGGTTACCGCCTTCTGCTCTGCCC 
T CCCC GGT T C CC C C CC G GAGGT ACCGCCTTCTGCTC GC C 
25 -MEXICO TCCCCCCTGGTAATGCCCCGACTGCCCCGCCGAGTGAGGTCACCGCCnCTGCTCAGCTC 

1990v 2000v 2010v 2020v 2030v 2040v 
-BURMA TATACAGGTTTAACCGTGAGGCCCAGCGCCATTCGCTGATCGGTAACTTATGGTTCCATC 
T TA AGG AACCG AG CCAGCGCCA TCG T AT GGTA TT TGG T CA C 
30 -MEXICO TTTATAGGCACAACCGGCAGAGCCAGCGCCAGTCGGTTATTGGTAGTTTGTGGCTGCACC 

2050v 2060v 2070v 2080v 2090v 2100v 
-BURMA CTGAGGGACTCATTGGCCTCTTCGCCCCGTTTTCGCCCGGGCATGTTTGGGAGTCGGCTA 
CTGA GG T T GGCCT JJC C CC TTTTC CCCGGGCATG TGG GTC GGTA 
35 -MEXICO CTGAAGGTTTGCTCGGCCTGTTCCCGCCCTTTTCACCCGGGCATGAGTGGCGGTCTGCTA 

2110v 2120v 2130v 2140v 2150v 2160v
-BURMA ATCCATTCTGTGGCGAGAGCACACTTTACACCCGTACTTG6TCGGAGGTTGATGCCGTCT 
A CCATT TG GGCGAGAGCAC CT TACACCCG ACTTGGTC Tl G C 
^0 -MEXICO ACCCATTTTGCGGCGAGAGCACGCTCTACACCCGCACTTGGTCCACAAnACAGACACAC 

2170v 2180v 2190v 2200v 2210v 2220v 
-BURMA CTAGTCCAGCCCGGCCTGACTTAGGTTTTATGTCTGAGCCTTCTATACCTA6TAGGGCCG 
C C G C GGC T GGT T TG TG CT C C G GG C 
45 -MEXICO CCTTAACTGTCGGGCTAATTTCCGGTCATTTGGATGCTGCTCCCCACTCGGGGGG6CCAC 

2230v 2240v 2250v 2260v 2270v 2280v 
-BURMA CCACGCCTACCCTGGCGGCCCCTCTACCCCCCCCTGCACCGGACCCTTCCCCCCCTCCCT 
C C CT CC G C CT TA C C CTG C C CCC C 

50 -MEXICO CTGCTACTGCCACAGGCCCTGCTGTAGGCTCGTCTGACTCTCCAGACCCTGACCCGCTAC 

2290v 2300v 2310v 2320v 2330v 2340v 
-BURMA CTGCCCCGGCGCTTGCTGAGCCGGCTTCTGGCGCTACCGCCGGGGCCCCGGCCATAACTC 
CTG C TG C C TCTGG GC C G 6 CCC C AT 

55 -MEXICO CTGATGTTACAGATGGCTCACGCCCCTCTGGGGCCCGTCCGGCT6GCCCCAACCC6AATG 
2350v 2360v 2370v 2380v 2390v 2400v

-BURMA i::-3-:r:r"r^<-' r.:'':: -cccGGATGGCTCTAAGGTATTCG 
c CC: CoTCjC ' :: :ac.:^accc ga ggc ctaag t t g 

-MEXICO GCGT^::3:^G- ■:':::3r^-";:ACAC-ACCCTGACGGCGCTAAGATCTATG 


2410v 2420v 2430v 2440v 2450v 2460v
-BUR^A ::Gr^:3r"^^l'^G''^-:-^lC^"GT3GC^:3"AACGCGTCTAATGTTGACCACC 

CGy:^: ^ ''CGAo^ ^3:ac ^ggc^ g'^ aacgc tctaa g g ccacc 

-MEMCC ^CG3r::"'''':GAG^"'3"G'G:ACC"3C'"GTCAACGCATCTAACGCCGGCCACC 


2470v 2480v 2490v 2500v 2510v 2520v
-BURMA 3CC2^33:33C333.r^'' - :--"^^ACCAAAGGTACCCCGCCTCCTTTGATGCTG 
GC:C^33 ^GCGGG^^^'-. *^'3: 3 ^ACCC G TC TTTGA GC 

15 -MEXICO 3:22^33'33:333C"":'':':r"""CAG:GTTACCCTGATTCGTTTGACGCCA 

2530v 2540v 2550v 2560v 2570v 2580v
-BURMA CCTCTTT'G^GATGCGCG-=.:33C3:GGCCGCGTACACACTAACCCCCCGGCCAATAATTC 
CC ^T'GTGATGCG GA 33 GCCGCGTA AC CT AC CCCCGGCC AT ATTC 
20 -MEXICO CCAAGTTTGTGATGCGT3^'33t::tgCCGCGTATACCCTTACACCCCGGCCGATCATTC 

2590v 2600v 2610v 2620v 2630v 2640v
-BURMA ACGC'3TCGCCCCTGAT^:.':33''3G-iCATAACCCAAAGAGGCTTGAGGCTGCTTATC 
A GC GT 3CCCC GA 3 ' ^ :GAm '::iTAACCC AAGAGGCT GAGGCTGC TA C 
25 -MEXICO ATGCGG^GGCCCCG-^iA,:':'-, -'33hACATAACCCCAAGAGGCTCGAGGCTGCCTACC 

2650v 2660v 2670v 2680v 2690v 2700v
-BURMA GGGAAACTTGCTCCCGCC'CGGrACCbCTGCATACCCGCTCCTCGGGACCGGCATATACC 
G GA ACTTGC CCCGCC GGCAC GCTGC TA CC CTC T GG C GGCAT TACC 
30 -MEXICO GCGAGACTTGCGCCCGCC3AGGCACTGCTGCCTATCCACTCTTAGGCGCTGGCATTTACC 

2710v 2720v 2730v 2740v 2750v 2760v
-SURMA AGGTGCCGATCGGCCCCAGTTT7GACGCCTGGGAGCGGAACCACCGCCCCGGGGATGAGT 
AGGTGCC T G AGTTTTGA GCCTGGGAGCGGAACCACCGCCC GA GAG 
35 -MEXICO AGGTGCCTGTTAGTTTGAGT-TTGATGCCTGGGAGCGGAACCACCGCCCGnTGACGAGC 

2770v 2780v 2790v 2800v 2810v 2820v
-BURMA TGTACCTTCCTGAGCTTGCTGCCAGATGGTTTGAGGCCAATAGGCCGACCCGCCCGACTC 
T TACCT C GAGCT GC GC G TGGTTTGA CCAA G CC C CC AC 
40 -MEXICO TTTACCTAACAGAGCTGGCGGC^CGGTGGTTTGAATCCAACCGCCCCGGTCAGCCCACGT 

2830v 2840v 2850v 2860v 2870v 2880v
-BURMA TCACTATAACTGAGGATG'"GCACGGA_AGCOA^ TGGCCATCGAGCTTGACTCAGCCA 
T A ATAACTGAGGAT GC CG C GC AA CTGGCC T GAGCHGACTC G A 
45 -MEXICO TGAACATiAC^GAGGATA'-CGCCCoTGCGGCCAACCTGGCCCTGGAGCTTGACTCCGGGA 

2890v 2900v 2910v 2920v 2930v 2940v 
-BURMA CAGATGTCGGCCGGGCCT3TGCCGGCTGTCGGGTCACCCCCGGCGTTGTTCAGTACCAGT 
GA GT GGCCG GC TGTGCCG3 ^GT GTC CC GGCGTTGT C GTA CAGT 
50 -MEXICO GTGAAGTAGGCCGCGCATGTGCCGGGTGTAAAGTCGAGCCTGGCGnGTGCGGTATCAGT 

2950v 2960v 2970v 2980v 2990v SOOOv 
-BURMA TTACTGCAGGTGTGCCTG3ATCCGGCAAGTCCCGCTCTATCACCCAAGCC6ATGTGGACG 
TTAC GC 3GTGT CC G3 TC GGCAAGTC TC T CA GC GATGTGGA 6 
55 -MEXICO TTACAGCCGGTGTCCCCGGCTCTGGCAAGTCAAAGTCCGTGCAACAGGCGGATGTGGATG 
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3010v 3030v 3040v 3050v 3060v 

-BURMA TTG^CGTGG^::::^-CGCGTriAGT^GCGTAATGCCTGGCGCCGTCGCGGCTTTGCTGCn 

TTGT GT -y :: '.mg t cg aa gc tggcg cg cg GGcmbL i 
-MEXICO ttgttg-^3^g:::^.:^c3:gagcttc3Gaacgcttggcggcgccggggctttgcggcat 


BC^':^ y-c:- 3090. 3100v 3110v 31. Ov 
-qhcHm ""ACC -'^"'^::'r.: :^GAG^CAC:CAGGGGCGCCGGGTTGTCATTGATGAGGCTC 

T a: :::gc^ -^r^c . a':-: gg cg gggttgtcattgatgaggc c 

-HE> :C0 TCAC^::G0- :3^G'C:-C^AGrGGCCGTAGGGTTGTCATTGATGAGGCCC 


313:. BlSCv 3160v 3170v 3180v 

-BURMA CATCCC^CC:::C^C-::^GC^GC'GCTCCACATGCAGCGGGCCGCCACCGTCCACCnC 
C TC C^CCCCCC CAC ^GC^GCT T CA ATGCAGCG GC GC C 6T CACCT C 
-MEXICO CTTCGC^CCCCCCACACT-GC^GCTTTTACATATGCAGCGTGCTGCATCTGTGCACCTCC 


3190v 320Cv 3210V 3220v 3230v 3240v 
-BURMA TTGGCGACCCGAACCAGATCCCAGCCATCGACTTTGAGCACGCTGGGCTCGTCCCCGCCA 
TTGG GACCCGAA CAGAtccc GCCAT GA TTTGAGCAC C GG CT T CC GC A 
-MEXICO TTGGGGACCCGAATCAGATCCCCGCCATAGATTTTGAGCACACCGGTCTGATTCCAGCAA 


3250v 3260v 32 70v 3280v 3290v 3300v 
-BURMA TCAGGCCCGACTTAGGCCCCACCTCCTGGTGGCATGTTACCCATCGCTGGCCTGCGGATG 
T GGCC GA G CCC AC TC TGGTGGCATGT ACCCA CG TG CCTGC GATG 
-MEXICO TACGGCCGGAGTTGGTCCCGACTTCATGGTGGCATGTCACCCACCGTTGCCCTGCAGATG 


3310v 332Cv 3330v 3340v 3350v 3360v 
-BURMA TATGCGAGCTCATCCGTGGTGCATACCCCATGATCCAGACCACTA6CCGGGTTCTCCGTT 
T TG GAG T TCCGTGGTGC TACCC A ATCCAGAC AC AG GGT CTCCGTT 
30 -MEXICO TCTGTGAGTTAGTCCGTGGTGCTTACCCTAAAATCCAGACTACAAGTAA6GTGCTCCGTT 

3370v 3380v 3390v 3400v 3410v 3420v 
-BURMA CGTTGTTCTGGGGTGAGCCTGCCGTCGGGCAGAAACTAGTGTTCACCCAGGCGGCCAAGC 
C T TTCTGGGG GAGCC GC GTCGG CAGAA CTAGTGTTCAC CAGGC GC AAG 
35 -MEXICO CCCTTTTCTGGGGAGAGCCAGCTGTCGGCCAGAAGCTAGTGTTCACACAGGCTGCTAAGG 

3430V 3440v 3450v 3460v . 3470v 3480v 
-BURMA CCGCCAACCCCGGCTCAGTGACGGTCCAC6AGGCGCAGGGCGCTACCTACACGGAGACCA 
CCGC ACCCCGG TC T ACGGTCCA GAGGC CAGGG GC AC T AC AC A 
40 -MEXICO CCGCGCACCCCGGATCTATAACGGTCCATGAGGCCCAGGGTGCCACTTTTACCACTACAA 

3490v 3500v 3510v 3520v 3530v 3540v 
-BURMA CTATTATTGCCACAGCAGATGCCCGGG6CCTTATTCAGTCGTCTCGGGCTCATGCCATTG 
CTAT ATTGC AC GCAGATGCCCG GGCCT AT CAGTC TC C6GGCTCA GC AT 6 
45 -MEXICO CTATAATTGCAACTGCAGATGCCCGTGGCCTCATACAGTCCTCCCGG6CTCACGCTATAG 

3550v 3560v 3570v 3580v 3590v 3600v 
-BURMA TTGCTCTGACGCGCCACACTGAGAAGTGCGTCATCATTGACGCACCAGGCCTGCnCGCG 
TTGCTCT AC G CA ACTGA AA TG GT AT TTGAC C CC GGCCTG T C6 G 
50 -MEXICO TTGCTCTCACTAGGCATACTGAAAAATGTGTTATACTTGACTCTCCCGGCCTGTT6CGTG 

3610v 3e20v 3630v 3640v 3650v 3660v 
-BURMA AGGTGGGCATCTCCGA"G"AATCGTTAATAACTTTTTCCTCGCTGGTGGCGAAATTGGTC 
AGGTGGG ATC^C ^a-TG: AT GT^AATAA TT TTCCT C GGTGGCGA TTGGTC 
55 -MEXICO AGGTGGGTATCTCAGATGCCA-^^GTTAATAATTTCTTCCTTTCGGGTGGCGAGGTTGGTC 
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367C.^ 358:. 369C. 3700v 3710v 3720v 

-BURMA accagcg::c:tcagtta":::cgtggcaaccc"gacgccaatgttgacaccctggctg 
ACCAG G CC-TC GT mT'c: cg ggcaaccctgac caatgttgac ct gc g 
-MEXICO accagagacc-tcgg^:a"::g:gaggca:ccctgaccgcaatgttgacgtgcttgcgg 

3^3:. 27^:, 3760v 3770v 3780v 

.euPMi cc^^:::g::g^:"gcc-".-^"ag^gccccat:;g'tggctgaggagcttggccaca 
c ag gcc^cca^cag gctgaggagct ggccac 

-MEXICO CG"'::ACC:^nv::^:.-:AGCG::"CCAT:AGCTTGCTGAGGAGCTGGGCCACC 



-BURMA GACC^ 
G CC 

-MEXICO GGCCG 



3320v 3830v 3840v 
^GCCCCGAGCTCGAACAGGGCCTTCTCT 
^GCOC GAGCT GA CAGGGCCTTCTCT 
^GCCCTGAGCTTGAGCAGGGCCTTCTCT 



3550v 3cc:. 35^:. 3S60v 3890v 3900v 

-BUPMA acc^gocco-ggagc'ca::;cc'g^ga';gtgtcgtaacatttgaattaacagacattg 

A CTG_ C^GGAGC^ OC OCTGTG; AGTGT GT ACATTTGA TAAC GACATTG 
-MEXICO ATCTGCCACAGGAGC^AGCC^CCTGTGACAGTGTTGTGACATTTGAGCTAACTGACATTG 

39:0v 3920. 3930v 3940v 3950v 3960v 
-BURMA TGCAC^GCCGCATGGCCGCCCCGAGCCAGCGCAAGGCCGTGCTGTCCACACTCGTGGGCC 
TGCACTGCCGCATGGC GCOOC AGCCA G AA GC GT TGTCCAC CT GT GGCC 
-MEXICO TGCACTGCCGCATGGCGGOCCCTAGCCAAAGGAAAGCTGTTTTGTCCACGCTGGTAGGCC 

3970v 3980v 3990v 4000v 4010v 4020v 
-BURMA GCTACGGCGGTCGCACAAAGCTCTACAATGCTTCCCACTCTGATGTTCGCGACTCTCTCG 
G TA GGC G CGCACAA GCT TA ATGC CAC C GATGT C6CG CTC CT G 
-MEXICO GGTATGGCAGACGCACAAGGCTTTATGATGCGGGTCACACCGATGTCCGCGCCTCCCTTG 



4030v 4040v 4050v 4060v 4070v 4080v 
-TASHKENT GGCCCCGTACAGGTCACAACCTGTGAGTTGTACGAGCTAG 

GGCCCCGTACAGGT ACAAC TGTGA TTGTACGAGCTA6 
-BURMA CCCGTTTTATCCCGGCCATTGGCCCCGTACAGGTTACAACTTGTGAATTGTACGAGCTAG 
35 C CG TTTAT CC C GG C GT G AC AC TGTGAA T T GAGCT G 

-MEXICO CGCGCTTTATTCCCACTCTCGGGCGGGTTACTGCCACCACCTGTGAACTCTTTGAGCTTG 

4090v 4100v 4110v 4120v 4130v 4140v 
-TASHKENT TGGAGGCCATGGTCGAGA AAGGCCAGGATGGCTCCGCCGTCCTTGAGCTCGATCTCTGCA 
^0 TGGAGGCCATGGTCGAGAA GGCCAGGATGGCTCCGCCGTCCTTGAGCT GATCT TGCA 

-BURMA TGGAGGCCATGGTC6AGAAGGGCCAGGATGGCTCCGCCGTCCTTGAGCTTGATCTTTGCA 
T GAGGC ATGGT GAGAAGGGCCA GA GG TC GCCGTCCT GAG T GAT T TGCA 
-MEXICO TAGAGGCGATGGTGGAGAAGGGCCAAGACGGTTCAGCCGTCCTCGAGTTGGATTTGTGCA 

45 4150v 4I60v 4170v 4180v 4190v 4200v 

-TASHKENT ACCGTGACGTGTCCAGGATCACCTTTTTCCAGAAAGATTGCAATAA6TTCACCACGGGAG 
ACCGTGACGTGTCCAGGATCACCTT TTCCAGAAAGATTG AA AAGTTCACCAC GG G 
-BURMA ACCGTGACGTGTCCAGGATCACCTTCTTCCAGAAAGATTGTAACAAGTTCACCACAGGTG 
CCG GA GT ^CC G AT ACCTT TTCCAGAA GATTGTAACAAGTTCAC AC GG G 
50 -MEXICO GCCGAGATGTCTCCCGCA-AACCTTTTTCCAGAAGGATTGTAACAAGTTCACGACCGGCG 
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-TASHKENT 



-BURMA 



-MEXICO 



AGACC'^i 
AGAC CA"^ 
AGAC Al 
A G A C A A 1 



^CCCATGGTAA 



: ^'^23Cv ^240v 4250v 4260v 

rGGGCCAGGGCATT-CGGCCTGGAGTAAGACCnCTGTG 
G^GGGCCAGGGCATTTCGGCCTGGAG AAGACCTTCTG G 
'GCCCATG3'a:::G^GGGCCAGGGCATCTCGGCCTGGAGCAAGACCTTCTGCG 
tg: ca^gg aaagt gg caggg ATCT CTGGAG aagac tt TG G 

TGCGCATGGCAAAGTCGGTCAGGGTATCTTCCGCTGGAGTAAGACGTTTTGTG 



-TAS-KENT CCCTT' 

ccc^ ^ 
-B'jR"A ccr:^ 
c cc ^ ^ 

-MEXICO CCCT3- 



-tashkent gtgtgt 
gtgtg^ 
-burma gtgtgt 
tgtgt 

-MEXICO CTGTGT 



Cv 42eOv ^290v 4300v 4310v 4320v 
CGGCCCCTGGTTCCGTGCTATTGAGAAGGCTATTCTGGCCCTGCTCCCTCAGG 
tggttccg GCTATTGAGAAGGCTATTCTGGCCCTGCTCCCTCAGG 
'GG'^CCGCGC^attgaGAAGGCTATTCTGGCCCTGCTCCCTCAGG 

^gg^^::g gc attgagaaggctattct ccct t cc ca g 
g3c:::^gg^'ccgtgcgattgagaaggctattctatcccttttaccacaag 



" r r r r ^ 



Ov ^2i:jv 4350v 4360v 4370v 4380v 
TTATGGGGATG:r:TTGATGACACCGTCTTCTCGGCGCGTGTGGCCGCAGCAA 

GG gatg:^:-ttgatgacaccgtcttctcggcg tgtggccgcagcaa 
^tacggtgatgcc^ttgatgacaccgtcttctcggcggctgtggccgcagcaa 
tacgg gatgc ^ tga gac c gt ttctc gc gc gtggc g gc a 
ctacgggga^3c^ta:gacgactcagtattctctgctgccgtggctggcgcca 



4390v 44C0- 4410v 4420v 4430v 4440v 

-TASHKENT aggcgtccatggtgtttgagaatgacttttctga&tttgactccacccagaataattttt 
aggc tccatggtgtttgagaatgacttttctgagtttgactccacccagaataa tttt 
-BURMA aggcatccatggtgtttgagaatgacttttctgagtttgactccacccagaataactttt 

CCATGGTGTTTGA AATGA TtttCTGAGTTTGACTC AC CAGAATAACTTTT 
-MEXICO GCCATGCCATGGTGTTTGAAAATGATTTTTCTGAGTTTGACTCGACTCAGAATAACTTTT 

4450v 4460v 4470v 4480v 4490v 4500v 
-TASHKENT CCCTGGGCCTAGAGTGTGCTATTATGGAGAAGTGTGGGATGCCGAAGTGGCTCATCCGCT 

C CTGGG CTAGAGTGTGCTATTATGGAG AGTGTGGGATGCCG AGTGGCTCATCCGC 
-BURMA CTCTGGGTCTAGAGTGTGCTATTATGGAGGAGTGTGGGATGCCGCAGTGGCTCATCCGCC 

C CT GGTCT GAGTG GC ATTATGGA GAGTGTGG ATGCC CAGTGGCT tc g 
-MEXICO CCCTAGGTCTTGAGTGCGCCATTATGGAAGAGTGTGGTATGCCCCAGTGGCnGTCAGGT 

4510v 4520V 4530v 4540v 4550v 4560v 
-TASHKENT TGTACCACCTTATAAGGTCTGCGTGGATCCTGCAGGCCCCGAAGGAGTCCCTGCGAGGGT 

TGTA CACCTTATAAGGTCTGCGTGGATC TGCAGGCCCC6AAGGAGTC CTGCGAGGGT 
-BURMA TGTATCACCTTATAAGGTCTGCGTGGATCTTGCAGGCCCCGAAGGAGTCTCTGCGAGGGT 

TGTA CA T GGTC GCGTGGATC TGCAGGCCCC AA GAGTCT TG GAGGGT 
-MEXICO TGTACCATGCCGTCCGGTCGGCGTGGATCCTGCAGGCCCCAAAAGAGTCTTTGAGAGGGT 

4570v 4580v 4590v 4600v 4610v 4620v 
-TASHKENT GTTGGAAGAAACACTCCGGTGAGCCCGGCACTCTTCTATGGAATACTGTCTGGAACATGG 
TTGGAAGAAACACTCCGG^GAGCCCGGCACTCTTCTATGGAATACTGTCTGGAA ATGG 
-BURMA TTTGGAAGAAACACTCCGGTGAGCCCGGCACTCTTCTATGGAATACTGTCTGGAATATGG 
T TGGAAGAA CA TC GGTGAGCC GGCA T CT TGGAATAC GT TGGAA ATGG 
-MEXICO TCTGGAAGAAGCATTCTGGTGAGCCGGGCAGCTTGCTCTGGAATACGGTGTGGAACATGG 



4630v 4640v 4650v 4660v 4670v 4680v 
-TASKENT CCGTTATCACCCATTGTTACGATTTCCGCGATTTGCAGGTGGCTGCCTTTAAAGGTGATG 

CCGTTAT ACCCA TGTTA GA TTCCGCGATTT AGGTGGCTGCCTTTAAAGGTGATG 
-BURMA CCGTTATTACCCACTGTTATGACTTCCGCGATTTTCAGGTGGCTGCCTTTAAAGGTGATG 

C T ATT CCCA TG TATGA TTCCG GA T CAGGT GC GCCH AA GG GA G 
-MEXICO CAATCATTGCCCATTGC^ATGAGTTCCGGGACCTCCAGGTTGCCGCCTTCAAGGGCGACG 
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u69:. -''-"v 4720v 4730v 4740v 

-TASH><ENT ATTCGAT;i3TGC^^:GC-^'^:-3'AC:GTrAGAGTCCAGGGGCTGCTGTCCTGATTGCTG 

ATTCGATAG^GC^^'GC-'G'G:-"::" CGTCAGAGTCCAGG GCTGCTGTCCTGAT GC G 
-BURMA ATTCGATAGTGC'^^GC- -*VG':TCG'ri:^:,GTCCAGGAGCTGCTGTCCTGATCGCCG 

A TCG T GT r- :g TAGAG CCAGG GC G T CT AT GC G 

-ME>::CO AC"CGG"CGTC:^:^G^" -^ --'"CCGC '-^^AGCCCAGGCGCCGGrTCGCTTATAGCAG 

a75Cv J77Cv 4780v 4790v 4800v 

-TASHKENT GCTG^GGC^'-^-GC^G;:vrGGr^^::GTCCGATTGGTTTGTATGCAGGTGTTGTGG 

GC'G^GGC" — G "".^'.r G '"::g ccgat ggtttgtatgcaggtgttgtgg 

-BURMA GCTGTGGC^'GAAG^'G:-^:G'^G'^"'C:g:CCGA^CGGTTTGTATGCAGGTGTTGTGG 

gc^g'gg ^tg-^-ag'-g--^ :G G" *^cgg ccgat GG TGTATGC GG GTTGT G 
-mexico gctgtggtt-ga:.gt^o-^^ggc'g;ct^c:ggccgattgggctgtatgccggggttgtcg 

4810v 4S20. 'iSGOv 4840v 4850v 4860v 
-TASHKENT TGACCCCCGGCCT'GGC GCGC^'CCCGACGTCGTGCGCTTGTCCGGCCGGCTTACTGAGA 

TG CCCCCGGCCTTGGCGCGC'TCCCGA GT GTGCGCTTG CCGGCCGGCTTAC GAGA 
-BURMA TGGCCCCCGGCCTTGGCGCGCTCCCTGATGTTGTGCGCTTCGCCGGCCGGCTTACCGAGA 

T GCCCC GG CT GG G: CC GATGT GT CG TTCGCCGG CGGCTT C GAGA 
-MEXICO TCGCCCCGGGGC'CGGGG:C"ACCCGATGTCGTTCGATTCGCCGGACGGCTTTCGGAGA 






4S70v 48c'\^ ^SGOv 4900v 4910v 4920v 
-TASHKENT AGAATTGGGGCCC'GGCCCTGAGCGGGCGGAGCAGCTCCGCCTTGCTGT 

AGAATTGGGGCCCTGGCCCTG^GCGGGCGGAGCAGCTCCGCCT GCTGT 
-BURMA AGAATTGGGGCCC'GGCCCTGAGCGGGCGGAGCAGCTCCGCCTCGCTGTTAGTGATTTCC 

AGAA 'GGGG CCTG CC G^GCGGGC GAGCAGCTCCGCCTCGC GT GATTTCC 
-MEXICO AGA^C^GGGGGCC'GA-C:":AGCGGGCAGAGCAGCTCCGCCTCGCCGTGCAGGATTTCC 






4930v ^9^yj 4950v 4960v 4970v 4980v 

-BURMA tccgcaagctcacgaatgtagctcagatgtgtgtggatgttgtttcccgtgtttatgggg 

TCCG A G T ACGAATGT GC CAGAT TGTGT GA GT GT TC G GTTTA GGGG 
-MEXICO TCCGTAGGTTAACGAATGTGGCCCAGATTTGTGTTGAGGTGGTGTCTAGAGTTTACGGGG 






4990v 50C0v 5010v 5020v 5030v 5040v 
-BURMA TTTCCCCTGGACTCGTTCATAACCTGATTGGCATGCTACAGGCTGTTGCTGATGGCAAGG 
TTTCCCC GG CT GTTCATAACCTGAT GGCATGCT CAG CT TTG TGATGG AAGG 
-MEXICO TTTCCCCGGGTCTGGTTCATAACCTGATAGGCATGCTCCAGACTATTGGTGATGGTAAGG 






5050v 5060v 5070v 5080v 5090v 5100v 
-BURMA CACATTTCACTGAGTCAGTAAAACCAGTGCTCGACTTGACAAATTCAATCTTGTGTCGGG 

C CATTT AC GAGTC GT AA CC T CT GAC T ACA A TCAAT TG CGG 
-MEXICO CGCATTTTACAGAGTCTGT^AAGCCTATACTTGACCTTACACACTCAATTATGCACCGGT 






SllOv 512Cv 5130v 5140V 5150v 5160v 
-BURMA TGGAATG--T:i-C-G'C'TTTGC^GCGCCCATGGGTTCGCGACCATGCGCCCTCGGCCT 
GAATGAATAACATGT tt^GCTGCGCCCATGGGTTCGC ACCATGCGCCCT GGCCT 
-MEXICO CTGAATGAATAACATG^GGTT'GCTGCGCCCATGGGTTCGCCACCATGCGCCCTAGGCCT 






5170v SlSOv 
-BURMA ATTTTGTTGCTGC^CC'C-^ 
TTTTG TG "G ^CC'C 
-MEXICO CTTTTGCTGTTG^^CC'"^ 



5190v 5200v 5210v 5220v 

tgtttt-gcctatgctgcccgcgccaccgcccggtcagccg 
TGTTT tgcctatg tgcccgcgccaccg ccggtcagccg 

"G^'tc^GCCTATGTTGCCCGCGCCACCGACCGGTCAGCCG 
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5230^ 5Z^0v 5250v 5260v 5270v 5280v 
-BURMA TC^GG:CGC.r.^:3^GGGC33CGCAGCGGCGGTTCCGGCGGTGGTTTCTGGGGTGACCGG 
TCTGGCCGCCG^CGTGGGCGGCGCAGCGGCGGT CCGGCGGTGGTTTCTGGGGTGACCGG 

-MEXICO tctggl:3c:gtcgtgggcggcgcagcggcggtaccggcggtggtttctggggtgaccgg 


529:> 53.C". 5310v 5320v 5330v 5340v 
-BUR"A G^TGA "r::G:::^^ CCCC^./^TATTCATCCAACCAACCCCnCGCCCCCGAT 
GTTG'^^'r:AGCC:^^:'-'^-:CCCaATATTCATCCAACCAACCCCTT GCCCC GA 

-MEXICO GT-GA''c^:^Gc::^'^ ::-.;tcccc^atattcatccaaccaacccctttgccccagac 


5350. 535\ 537:"v 53SOv 5390v 5400v 

-BURMA gtcac:3C^gcgg::gg]v:^ggai:c":gt.ttcgccaacccgcccgaccactcggctcc 
GT ccGCTGCG ccGGG :'ggacct:g ttcgccaacc gcccg ccact ggctcc 

-MEXICO GTTGCC3C^GCG'CCGGG^*tggaCCTCGCCTTCGCCAACCAGCCCGGCCACTTGGCTCC 


5410v 5420v 5430v 5440v 5450v 5460v 
20 -BURMA GCTTGGCGTGACCAGGCCCAGCGCCCCGCCGTTGCCTCACGTCGTAGACCTACCACAGCT 

CTTGGCG GA CAGGCCCAGCGCCCC CCG TGCCTC CGTCG GACCT CCACAGC 
-MEXICO ACTTGGCGAGATCAGGCCCAGCGCCCCTCCGCTGCCTCCCGTCGCCGACCTGCCACAGCC 

5470v 5480v 5490v 5500v 5510v 5520v 
25 -BURMA GGGGCCGCGCCGCTAACCGCGGTCGCTCCGGCCCATGACACCCCGCCAGTGCCTGATGTC 

GGGGC GCG CGCT AC GC GT GC CC GCCCATGACACC C CC GT CC GA GT 
-MEXICO GGGGCTGCGGCGCTGACGGCTGTGGCGCCTGCCCATGACACCTCACCCGTCCCGGACGTT 

5530v 5540v 5550v 5560v 5570v 5580v 
30 -BURMA GACTCCCGCGGCGCCATCTTGCGCCGGCAGTATAACCTATCAACATCTCCCCnACCTCT 

GA TC CGCGG GC AT T CGCCG CAGTATAA T TC AC TC CCCCT AC TC 
-MEXICO GATTCTCGCGGTGCAATTCTACGCCGCCAGTATAATTTGTCTACTTCACCCCTGACATCC 

5590v 5600v 5610v 5620v 5630v 5640v 
35 -BURMA TCCGTGGCCACCGGCACTAACCTGGTTCTTTATGCCGCCCCTCTTAGTCCGCTTTTACCC 

TC GTGGCC C GGCACTAA T GT CT TATGC GCCCC CTTA TCCGC T T CC 
-MEXICO TCTGTGGCCTCTGGCACTAATTTAGTCCTGTATGCAGCCCCCCTTAATCCGCCTCTGCCG 

5650V 5660V 5670v 5680v 5690v 5700v 
40 -BURMA CTTCAGGACGGCACCAATACCCATATAATGGCCACGGAAGCTTCTAATTATGCCCAGTAC 

CT CAGGACGG AC AATAC CA AT ATGGCCAC GA GC TC AATTATGC CAGTAC 
-MEXICO CTGCAGGACGGTACTAATACTCACATTATGGCCACAGAGGCCTCCAATTATGCACA6TAC 

5710v 5720v 5730V 5740v 5750v 5760v 
45 -BURMA CGGGTTGCCCGTGCCACAATCCGTTACCGCCCGCTGGTCCCCAATGCTGTCGGCGGHAC 

CGGGTTGCCCG GC AC ATCCGTTACCG CC CT GT CC AATGC GT GG GG TA 
-MEXICO CGGGTTGCCCGCGCTACTATCCGTTACCGGCCCCTAGTGCCTAATGCAGTTGGAGGCTAT 

5770v 57S0v 5790v 5800v 5810v 5820v 
50 -BURMA GCCATCTCCATCTCATTCTGGCCACAGACCACCACCACCCCGACGTCCGnGATATGAAT 

GC AT TCCAT TC TTC^GGCC CA AC ACCAC ACCCC AC TC GHGA ATGAAT 
-MEXICO GCTATATCCATTTC^TTC'GGCCTCAAACAACCACAACCCCTACATCTCnGACATGAAT 
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-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 

-BURMA 
-MEXICO 



I L ^* '-^ 1 ^ ^ ■ ^ ■ J L ■ ■ 

TC AT AC TC AC 
TCCATTACT^CCAC^ 



555C. 5S60v 5870v 5880v 
:g-^— ;G'CCAGCCCGGCATAGCCTCTGAGCTTGTG 
G at; - GT CA CC GGCATAGC TCTGA T GT 

^gga^tcttgttcaacctggcatagcatctgaattggtc 



5590. ■ 590Cv 59i:v 5920v 5930v 5940v 
- CCCiAG^GAGCGCr-^C-:^A-CGTiACCAAGGCTGGCGCTCCGTCGAGACCTCTGGG 
ATCCCAiG GAGCGCCT :-:'A l3 AA CAAGG TGGCGCTC GT GAGAC TCTGG 
ATCCCAAGCGAGCGCr'CAC^ACCGCAATCAAGGTTGGCGCTCGGTTGAGACATCTGGT 

5950v 596:. 5970v 5980v 5990v 6000v 
GTGGC^GAGGAGGAGGC'":r:^GG^r"rTATGCTTTGCATACATGGCTCACTCGTA 
GT GC^GAGGAGGA GC ^CC: GGT^TG^ ATG T TGCATACATGGCTC C GT 
GTTGCTGiGGAGGAAGCC-T^CCGGTrTTGTCATGTTATGCATACATGGCTCTCCAGTT 



6010v 6c:. ■ 

AATTCCTATACTAATAG.:.^ 
AA TCCTATAC AATAC 

aactcctataccaa::.C'"' 



6030v 6040v 6050v 6060v 
:"A'ACCGGTGCCCTCGGGCTGTTGGACTTTGCCCTTGAG 
^ATACCGGTGCCCT GG T TGGACTTTGCC T GAG 
rTATMCCGGTGCCCTTGGCTTACTGGACTTTGCCTTAGAG 



6070v 6Gcr "^0907 6100v 6110v 6120v 

CTTGAGTTTCGCAACCTT^CCCCCGGTAACACCAATACGCGGGTCTCCCGTTATTCCAGC 
CTTGAGTTTCGCAA CT ^CC CC GTAACACCAATAC CG GT TCCCGTTA TCCAGC 
CTTGAGTTTCGCAATCTC-CCACCTGTAACACCAATACACGTGTGTCCCGTTACTCCAGC 

6130v 6140v 6150v 6160v 6170v 6180v 
ACTGCTCGCCACCGCCTTCGTCGCGGTGCGGACG6GACTGCCGAGCTCACCACCACGGCT 
ACTGCTCG CAC C CG G G GACGGGACTGC GAGCT ACCAC AC GC 
ACTGCTCGTCACTCCGCCCGAGGGGCC---GACGGGACTGCGGAGCTGACCACAACTGCA 

6190V 6200v 6210v 6220v 6230v 6240v 
GCTACCCGCTTTATGAAGGACCTCTATTTTACTAGTACTAATGGTGTCGGTGAGATCGGC 
GC ACC G TT ATGAA GA CTC A TTTAC G TAATGG GT GGTGA TCGGC 
GCCACCAGGTTCATGAAAGATCTCCACTTTACCGGCCTTAATGG6GTAGGTGAAGTCGGC 






6250V 6260v 6270v 6280v 6290v 6300v 
-BURMA CGCGGGATAGCCCTCACCCTGTTCAACCTTGCTGACACTCTGCTT6GCGGCCTGCCGACA 
CGC6GGATAGC CT AC T T AACCTTGCTGACAC CT CT GGCGG CT CCGACA 
-MEXICO CGCGGGATAGCTCTAACATTACTTAACCTTGCTGACACGCTCCTCGGCGG6CTCCCGACA 






6310v 6320v 6330v 6340v 6350v 6360v 
-BURMA GAATTGATTTCGTCGGCTGGTGGCCAGCTGTTCTACTCCCGTCCCGTTGTCTCAGCCAAT 
GAATT ATTTCGTCGGCTGG GG CA CTGTT TA TCCCG CC GTTGTCTCAGCCAAT 
-MEXICO GAATTAATTTCGTCGGC'GGCGGGCAACTGTTTTATTCCCGCCCGGTTGTCTCAGCCAAT 






6370v 63SCv 6390v 6400v 6410v 6420v 
-BURMA GGCGAGCCGACTGTTAAGTTGTATACATCTGTAGAGAATGCTCAGCAGGATAAGGGTATT 
GGCGAGCC AC GT AAG T TATACATC GT GAGAATGCTCAGCAGGATAAGGGT TT 
-MEXICO GGCGAGCCAACCGTGAAGCTCTATACATCAGTGGAGAATGCTCAGCAGGATAAGGGTGTT 






6430v 64':0v 6450v 6460v 6470v 6480v 
-BURMA GCAATCCCGCATGACATTGACCTCGGAGAATCTCGTGTGGTTATTCAGGAHATGATAAC 
GC ATCCC CA GA AT GA CT GG GA TC CGTGTGGT ATTCAGGAHATGA AAC 
-MEXICO GCTATCCCCCACGATATCGATCTTGGTGATTCGCGTGTGGTCATTCAGGAUATGACAAC 
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6490v 650'% 65iOv 6520v 6530v 6540v 
-BURMA CA;cA':>a:.::.AG^ ''^G:~:::oAr3CCTTCTCCAGCCCCATCGCGCCCTTTCTCTGTCCTT 

Cm catg-:. c- ga-:gg/: ac cc tc cc gc ccatc cg ccttt tctgt ct 
-MEXICO cagcatg:3caggatcg"v::;c:ccgtcgcctgcgccatctcggcctttttctgttctc 

655Cv 65-.% 657^. 6580v 6590v 6600v 

-burma cgagctaatgatgtgc — tggctctctctcaccgctgccgagtatgaccagtccacnat 
cgagc aatgaigt ctt^ggct 'c ctcac gc gccgagtatgaccagtccactta 

-MEXICO CGAGCAAATGATGTACTTTGuCTGTCCCTCACTGCAGCCGAGTATGACCAGTCCACTTAC 

65i0v 6620v 6630v 6640v 6650v 6660v 
-BURMA GGCTC^'CGACTGGCCCAGTTTATGTTTCTGACTCTGTGACCTTGGTTAATGTTGCGACC 
GG TC 'C ACTGGCCC GTTTAT T TC GAC gtgac ttggt aatgttgcgac 
-MEXICO gggtcgtcaactggcccggtttatatctcggacagcgtgactttggtgaatgttgcgact 

667Cv 668Cv 6690v 6700v 6710v 6720v 

-burma ggcgcgcaggccgtigcccggtcgctcgattggaccaaggtcacacttgacggtcgcccc 
ggcgcgc-ggccgt gcccg tcgct ga tgg ccaa gtcac ct gacgg cg ccc 
-MEXICO ggcgcgcaggccgtagcccgatcgcttgactggtccaaagtcaccctcgacgggcggccc 

6730v 6740v 6750v 6760v 6770v 6780v 

-BURMA ctctccaccatccagc^g'actcgaagaccttctttgtcctgccgctccgcggtaagctc 

CTC C AC T AGCA "A TC AAGAC TTCTTTGT CT CC CT CG GG AAGCTC 
-MEXICO CTCCCGACTGTTGAGCAATATTCCAAGACATTCTTTGTGCTCCCCCTTCGTGGCAAGCTC 

6790v 6800v 6810v 6820v 6830v 6840v 
-BURMA TCTTTCTGGGAGGCAGGCACAACTAAAGCCGGGTACCCTTATAATTATAACACCACTGCT 
TC TT TGGGAGGC GGCACAAC AAAGC GG TA CCTTATAAHATAA AC ACTGCT 
-MEXICO TCCTTTTGGGAGGCCGGCACAACAAAAGCAGGTTATCCTTATAATTATAATACTACTGCT 

6850v 6860v 6870v 6880v 6890v 6900v 
-BURMA AGCGACCAACTGCTTGTCGAGAATGCCGCCGGGCACCGGGTCGCTATTTCCACTTACACC 
AG GACCA T CT T GA AATGC GCCGG CA CGGGTCGC ATTTC AC TA ACC 
-MEXICO AGTGACCAGATTCTGATTGAAAATGCTGCCGGCCATCGGGTCGCCATTTCAACCTATACC 

6910v 6920v 6930v 6940v 6950v 6960v 
-BURMA ACTAGCCTGGGTGCTGGTCCCGTCTCCATTTCTGCGGTTGCCGTTTTAGCCCCCCACTCT 

AC AG CT GG GC GGTCC GTC CCATTTCTGCGG GC GTTTT GC CC C CTC 
-MEXICO ACCAGGCTTGGGGCCGGTCCGGTCGCCATTTCTGCGGCCGCGGTTTTGGCTCCACGCTCC 

6970v 6980v 6990v 7000v 70I0v 7020v 
-BURMA GCGCTAGCATTGCTTGAGGATACCTTGGACTACCCTGCCCGCGCCCATACTTTTGATGAT 

GC CT GC TGCT GAGGATAC TT GA TA CC G CG GC CA AC TTTGATGA 
-MEXICO GCCCTGGCTCTGCTGGAGGATACTTTTGATTATCCGGGGCGGGCGCACACATTTGATGAC 

7030v 7040v' 7050v 7060v 7070v 7080v 
-BURMA TTCTGCCCAGAGTGCCGCCCCCTTGGCCTTCAGGGCTGCGCTTTCCAGTCTACTGTCGCT 
TTCTGCCC GA TGCCGC C T GGCCT CAGGG TG GCTTTCCAGTC ACTGTC6CT 
-MEXICO TTCTGCCCTGAATGCCGCGC^TTAGGCCTCCAGGGTTGTGCTTTCCAGTCAACTGTCGCT 



7090v ZlCHv 7110v 7120v 7130v 7140v 
-BURMA GAGCTTCAGCGCCTTAAGATGAAGGTGGGTAAAACTCGGGAGTTGTAGTTTATTTGCTTG 
GAGCT CAGCGCCTTAA T AAGGTGGGTAAAACTCGGGAGTTGTAGTTTAnTG TG 
-MEXICO GAGCTCCAGCGCCTTAAAGTTAAGGTGGGTAAAACTCGGGAGTTGTAGTTTATTTGGCTG 






715Cv :'i70v 7180v 7190v 

-BURMA TGCCCCCC"r":^r' '"^^^^'CTCATTTCTGCGTTCCGCGCTCCC 

TGCCC CCT C" TTATTTC TTTCT GT CCGCGCTCCC 

-MEXICO TGCCCACC'AC''AT:':";r",:"-::TTT.;7Tr^yy^yj(^y(^j^(^y(^j.^.g^^^y^.^^ 

-BURMA TG^ 
TGA 

-MEXICO ^G^ 



A number of reading frames, which are potential coding
regions, have been found within the DNA sequences set forth
above. As has already been noted, consensus residues for the
RNA-directed RNA polymerase (RDRP) were identified in the HEV
(Burma) strain clone ET1.1. Once a contiguous overlapping set
of clones was accumulated, it became clear that the
nonstructural elements containing the RDRP as well as
what were identified as consensus residues for the
helicase domain were located in the first large open
reading frame (ORF1). ORF1 covers the 5' half of the
genome and begins at the first encoded met, after the
27th bp of the apparent non-coding sequence, and then
extends 5079 bp before reaching a termination codon.
Beginning 37 bp downstream from the ORF1 stop codon in
the plus 1 frame is the second major open reading
frame (ORF2) extending 1980 bp and terminating 68 bp
upstream from the point of poly A addition. The third
forward ORF (in the plus 2 frame) is also utilized by
HEV. ORF3 is only 370 bp in length and would not have
been predicted to be utilized by the virus were it not
for the identification of the immunoreactive cDNA
clone 406.4-2 from the Mexico SISPA cDNA library (see
below for detailed discussion). This epitope
confirmed the utilization of ORF3 by the virus,
although the means by which this ORF is expressed has
not yet been fully elucidated. If we assume that the
first met is utilized, ORF3 overlaps ORF1 by 1 bp at
its 5' end and ORF2 by 328 bp at its 3' end. ORF2
contains the broadly reactive 406.3-2 epitope and also
a signal sequence at its extreme 5' end. The first
half of this ORF2 also has a high pI value (>10)
similar to that seen with other virus capsid proteins.
These data suggest that the ORF2 might be the
predominant structural gene of HEV.
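
The frame bookkeeping used in this discussion (three forward frames, each ORF read from its first met codon to a termination codon) can be illustrated with a minimal Python sketch. The function name `find_orfs`, the `min_len` cutoff, and the toy sequence are assumptions made for illustration only; the toy sequence is not HEV sequence, and this is not presented as the procedure by which the ORFs above were identified.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_len=30):
    """Scan the three forward frames for ATG-to-stop open reading frames.

    Returns a list of (frame, start, end) tuples. Coordinates are 0-based,
    end-exclusive, and include the stop codon. Only the first ATG after the
    previous stop is used as the start (the "first met" convention).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first met in this frame
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:   # keep ORFs above the cutoff
                    orfs.append((frame, start, i + 3))
                start = None                   # look for the next ORF
    return orfs


# Hypothetical toy sequence with one ORF in frame 2 and one in frame 0.
toy = "CC" + "ATG" + "GCT" * 10 + "TAA" + "G" + "ATG" + "AAA" * 12 + "TGA"
print(find_orfs(toy))  # [(0, 39, 81), (2, 2, 38)]
```

Overlaps such as the one described between ORF3 and its neighbors would show up here as tuples from different frames whose coordinate ranges intersect.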



The existence of subgenomic transcripts prompted
a set of experiments to determine whether these RNAs
were produced by splicing from the 5' end of the
genome. An analysis using subgenomic probes from
throughout the genome, including the extreme 5' end,
did not provide evidence for a spliced transcript.
However, it was discovered that a region of the
genome displayed a high degree of homology with a 21
bp segment identified in Sindbis as a probable
internal initiation site for RNA transcription used in
the production of its subgenomic messages. Sixteen of
21 (76%) of the nucleotides are identical.
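
A comparison of this kind (a short motif scored against every ungapped window of a longer sequence) can be sketched in Python as follows. The function name and both sequences are hypothetical placeholders, since neither the Sindbis 21 bp segment nor the matching HEV region is reproduced here; only the final arithmetic, 16 matches out of 21 positions, comes from the text.

```python
def best_window_identity(motif, target):
    """Slide `motif` along `target` without gaps.

    Returns (offset, matches) for the first window with the most
    identical positions.
    """
    motif, target = motif.upper(), target.upper()
    best = (0, -1)
    for off in range(len(target) - len(motif) + 1):
        window = target[off:off + len(motif)]
        matches = sum(a == b for a, b in zip(motif, window))
        if matches > best[1]:
            best = (off, matches)
    return best


# Hypothetical 21-nt motif and a target carrying a copy with 5 mismatches.
motif = "ACGT" * 5 + "A"
target = "GGGG" + "TCGT" * 5 + "A" + "GGGG"

offset, matches = best_window_identity(motif, target)
print(offset, matches)                     # 4 16
print(round(100 * matches / len(motif)))   # 76
```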

Two cDNA clones which encode an epitope of HEV 
that is recognized by sera collected from different 
ET-NANB outbreaks (i.e., a universally recognized 
epitope) have been isolated and characterized. One of
the clones immunoreacted with 8 human sera from
different infected individuals and the other clone 
immunoreacted with 7 of the human sera tested. Both 
clones immunoreacted specifically with cyno sera from 
infected animals and exhibited no immunologic response 
to sera from uninfected animals. The sequences of the 
cDNAs in these recombinant phages, designated 406.3-2 
and 406.4-2 have been determined. The HEV open reading 
frames are shown to encode epitopes specifically 
recognized by sera from patients with HEV infections. 
The cDNA sequences and the polypeptides that they 
encode are set forth below. 

Epitopes derived from Mexican strain of HEV: 

406.4-2 sequence (nucleotide sequence has SEQ ID 
NO. 13; amino acid sequence has SEQ ID NO. 14): 



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SEQ ID NO. 13 : 

C GCC AAC CAG CCC GGC CAC TTG GCT CCA CTT GGC GAG ATC AGG CCC      46
  Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro
  1               5                   10                  15

AGC GCC CCT CCG CTG CCT CCC GTC GCC GAC CTG CCA CAG CCG GGG CTG    94
Ser Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu
                20                  25                  30

CGG CGC TGA CGGCTGTGGC GCCTGCCCAT GACACCTCAC CCGTCCCGGA           143
Arg Arg

CGTTGATTCT CGCGGTGCAA TTCTACGCCG CCAGTATAAT TTGTCTACTT CACCCCTGAC 203

ATCCTCTGTG GCCTCTGGCA CTAATTTAGT CCTGTATGCA GCCCCCCTTA ATCCGCCTCT 263

GCCGCTGCAG GACGGTACTA ATACTCACAT TATGGCCACA GAGGCCTCCA ATTATGCACA 323

GTACCGGGTT GCCCGCGCTA CTATCCGTTA CCGGCCCCTA GTGCCTAATG CAGTTGGAGG 383

CTATGCTATA TCCATTTCTT TCTGGCCTCA AACAACCACA ACCCCTACAT CTGTTGACAT 443

GAATTC                                                            449

SEQ ID NO. 14 : 

Ala Asn Gln Pro Gly His Leu Ala Pro Leu Gly Glu Ile Arg Pro Ser
1               5                   10                  15

Ala Pro Pro Leu Pro Pro Val Ala Asp Leu Pro Gln Pro Gly Leu Arg
            20                  25                  30

Arg

406.3-2 sequence (nucleotide sequence has SEQ
ID NO. 15; amino acid sequence has SEQ ID NO. 16):

SEQ ID NO. 15 :

GGAT ACT TTT GAT TAT CCG GGG CGG GCG CAC ACA TTT GAT GAC TTC TGC   49
     Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys
     1               5                   10                  15

CCT GAA TGC CGC GCT TTA GGC CTC CAG GGT TGT GCT TTC CAG TCA ACT    97
Pro Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr
                20                  25                  30

GTC GCT GAG CTC CAG CGC CTT AAA GTT AAG GTT                       130
Val Ala Glu Leu Gln Arg Leu Lys Val Lys Val
            35                  40
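
The correspondence between the nucleotide and amino acid lines of SEQ ID NO. 15 can be checked with a short Python sketch that translates the listing with the standard genetic code. The nucleotide string below is transcribed from the listing above, with the thirteenth codon read as GAC to agree with the Asp shown beneath it and with the Mexico strain alignment; the reading offset of 4 (skipping the leading GGAT) and the helper names are assumptions of this sketch.

```python
# Standard genetic code built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna, offset=0):
    """Translate `dna` codon by codon, starting at 0-based `offset`."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


# SEQ ID NO. 15 as transcribed from the listing (see assumptions above).
SEQ_15 = "".join("""
    GGAT ACT TTT GAT TAT CCG GGG CGG GCG CAC ACA TTT GAT GAC TTC TGC
    CCT GAA TGC CGC GCT TTA GGC CTC CAG GGT TGT GCT TTC CAG TCA ACT
    GTC GCT GAG CTC CAG CGC CTT AAA GTT AAG GTT
""".split())

print(len(SEQ_15))                   # 130
print(translate(SEQ_15, offset=4))   # TFDYPGRAHTFDDFCPECRALGLQGCAFQSTVAELQRLKVKV
```

The 42-residue output agrees with the one-letter Mexican 406.3-2 sequence used in the strain comparison.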
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SEQ ID NO. 16 : 

Thr Phe Asp Tyr Pro Gly Arg Ala His Thr Phe Asp Asp Phe Cys Pro
1               5                   10                  15

Glu Cys Arg Ala Leu Gly Leu Gln Gly Cys Ala Phe Gln Ser Thr Val
            20                  25                  30

Ala Glu Leu Gln Arg Leu Lys Val Lys Val
        35                  40

The universal nature of these epitopes is
evident from the homology exhibited by the DNA that
encodes them. If the epitope coding sequences from
the Mexican strains shown above are compared to DNA
sequences from other strains, such as the Burmese
strain also set forth above, similarities are
evident, as shown in the following comparisons.
Comparison of 406.4-2 epitopes, HEV Mexico and Burma strains

                                10        20        30
MEXICAN (SEQ ID NO. 17) ANQPGHLAPLGEIRPSAPPLPPVADLPQPGLRR

BURMA (SEQ ID NO. 18)   ANPPDHSAPLGVTRPSAPPLPHVVDLPQLGPRR
                                10        20        30

There is 73.5% identity in a 33-amino acid overlap. 



Comparison of 406.3-2 epitopes, HEV Mexico and Burma strains

                                10        20        30        40
MEXICAN (SEQ ID NO. 19) TFDYPGRAHTFDDFCPECRALGLQGCAFQSTVAELQRLKVKV

BURMA (SEQ ID NO. 20)   TLDYPARAHTFDDFCPECRPLGLQGCAFQSTVAELQRLKMKV
                                10        20        30        40

There is 90.5% identity in the 42-amino acid overlap. 
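
The 90.5% figure can be reproduced with a position-by-position count, sketched below in Python. The two strings are the 406.3-2 epitopes as given in the comparison above; the function name is an assumption, and the sketch presumes the sequences are already aligned without gaps, which holds for this pair.

```python
def percent_identity(a, b):
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


MEXICO_406_3_2 = "TFDYPGRAHTFDDFCPECRALGLQGCAFQSTVAELQRLKVKV"
BURMA_406_3_2 = "TLDYPARAHTFDDFCPECRPLGLQGCAFQSTVAELQRLKMKV"

print(len(MEXICO_406_3_2))                                        # 42
print(round(percent_identity(MEXICO_406_3_2, BURMA_406_3_2), 1))  # 90.5
```

The two sequences differ at four positions (2, 6, 20 and 40), giving 38 identical residues out of 42.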

It will be recognized by one skilled in the
art of molecular genetics that each of the specific
DNA sequences given above shows a corresponding
complementary DNA sequence as well as RNA sequences
corresponding to both the principal sequence shown and
other sources of genetic material), as is well known
in the art.

3. Two amino acid sequences or two nucleotide
sequences (in an alternative definition for homology
between two nucleotide sequences) are considered
homologous (as this term is preferably used in this
specification) if they have an alignment score of >5
(in standard deviation units) using the program ALIGN
with the mutation gap matrix and a gap penalty of 6 or
greater. See Dayhoff, M.O., in Atlas of Protein
Sequence and Structure (1972) Vol. 5, National
Biomedical Research Foundation, pp. 101-110, and
Supplement 2 to this volume, pp. 1-10. The two
sequences (or parts thereof, preferably at least 30
amino acids in length) are more preferably homologous
if their amino acids are greater than or equal to 50%
identical when optimally aligned using the ALIGN
program mentioned above.
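
The idea of an alignment score expressed in standard deviation units can be sketched in Python as follows: a real alignment score is compared with the scores obtained after shuffling one of the sequences. This is not the ALIGN program and does not use Dayhoff's matrix; the scoring values (match +2, mismatch -1, gap -2), the number of shuffles, and the function names are all assumptions chosen to keep the example small, so the resulting numbers are not comparable to ALIGN scores.

```python
import random
import statistics


def global_score(a, b, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        cur = [i * gap]
        for j, y in enumerate(b, 1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def alignment_z_score(a, b, n_shuffles=200, seed=0):
    """Real score minus mean shuffled score, in standard deviation units."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    real = global_score(a, b)
    chars = list(b)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(chars)             # same composition, random order
        null_scores.append(global_score(a, "".join(chars)))
    sd = statistics.pstdev(null_scores)
    if sd == 0:
        raise ValueError("shuffled scores show no variation")
    return (real - statistics.mean(null_scores)) / sd


# 406.3-2 epitopes (Mexico and Burma) as given in the strain comparison.
MEXICO = "TFDYPGRAHTFDDFCPECRALGLQGCAFQSTVAELQRLKVKV"
BURMA = "TLDYPARAHTFDDFCPECRPLGLQGCAFQSTVAELQRLKMKV"

z = alignment_z_score(MEXICO, BURMA)
print(z > 5)
```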

4. A DNA fragment is "derived from" an ET-NANB
viral agent if it has the same or substantially the
same basepair sequence as a region of the viral agent
genome.

5. A protein is "derived from" an ET-NANB viral
agent if it is encoded by an open reading frame of a
DNA or RNA fragment derived from an ET-NANB viral
agent.

II. Obtaining Cloned ET-NANB Fragments

According to one aspect of the invention, it has
been found that a virus-specific DNA clone can be
produced by (a) isolating RNA from the bile of a
cynomolgus monkey having a known ET-NANB infection,
(b) cloning the cDNA fragments to form a fragment
library, and (c) screening the library by
differential hybridization to radiolabeled cDNAs from
infected and non-infected bile sources.

A. cDNA Fragment Mixture

ET-NANB infection in cynomolgus monkeys is
initiated by inoculating the animals intravenously
with a 10% w/v suspension from human case stools
positive for 27-34 nm ET-NANB particles (mean diameter
32 nm). An infected animal is monitored for elevated
levels of alanine aminotransferase, indicating
hepatitis infection. ET-NANB infection is confirmed by
immunospecific binding of seropositive antibodies to
virus-like particles (VLPs), according to published
methods (Gravelle). Briefly, a stool (or bile)
specimen taken from the infected animal 3-4 weeks
after infection is diluted 1:10 with phosphate-buffered
saline, and the 10% suspension is clarified
by low-speed centrifugation and filtration
successively through 1.2 and 0.45 micron filters. The
material may be further purified by pelleting through
a 30% sucrose cushion (Bradley). The resulting
preparation of VLPs is mixed with diluted serum from
human patients with known ET-NANB infection. After
incubation overnight, the mixture is centrifuged
overnight to pellet immune aggregates, and these are
stained and examined by electron microscopy for
antibody binding to the VLPs.

ET-NANB infection can also be confirmed by
seroconversion to VLP-positive serum. Here the serum
of the infected animal is mixed as above with 27-34 nm
VLPs isolated from the stool specimens of infected
human cases and examined by immune electron microscopy
for antibody binding to the VLPs.

Bile can be collected from ET-NANB positive
animals by either cannulating the bile duct and
collecting the bile fluid or by draining the bile
duct during necropsy. Total RNA is extracted from the
bile by hot phenol extraction, as outlined in Example
1A. The RNA fragments are used to synthesize
corresponding duplex cDNA fragments by random priming,
also as referenced in Example 1A. The cDNA fragments
may be fractionated by gel electrophoresis or density
gradient centrifugation to obtain a desired size class
of fragments, e.g., 500-4,000 basepair fragments.

Although alternative sources of viral material,
such as VLPs obtained from stool samples (as
described in Example 4), may be used for producing a
cDNA fraction, the bile source is preferred. According
to one aspect of the invention, it has been found that
bile from ET-NANB-infected monkeys shows a greater
number of intact viral particles than material
obtained from stool samples, as evidenced by immune
electron microscopy. Bile obtained from an ET-NANB
infected human or cynomolgus macaque, for use as a
source of ET-NANB viral protein or genomic material,
or intact virus, forms part of the present invention.


B. cDNA Library and Screening 

The cDNA fragments from above are cloned into a
suitable cloning vector to form a cDNA library. This
may be done by equipping blunt-ended fragments with a
suitable end linker, such as an EcoRI sequence, and
inserting the fragments into a suitable insertion site
of a cloning vector, such as at a unique EcoRI site.
After initial cloning, the library may be re-cloned,
if desired, to increase the percentage of vectors
containing a fragment insert. The library construction
described in Example 1B is illustrative. Here cDNA
fragments were blunt-ended, equipped with EcoRI ends,
and inserted into the EcoRI site of the lambda phage
vector gt10. The library phage, which showed less than
5% fragment inserts, was isolated, and the fragment
inserts re-cloned into the lambda gt10 vector,
yielding more than 95% insert-containing phage.

The cDNA library is screened for sequences
specific for ET-NANB by differential hybridization to
cDNA probes derived from infected and non-infected
sources. cDNA fragments from infected and non-infected
source bile or stool viral isolates can be prepared as
above. Radiolabeling the fragments is by random
labeling, nick translation, or end labeling,
according to conventional methods (Maniatis, p. 109).
The cDNA library from above is screened by transfer to
duplicate nitrocellulose filters, and hybridization
with both infected-source and non-infected-source
(control) radiolabeled probes, as detailed in Example
2. In order to recover sequences that hybridize at the
preferred outer limit of 25-30% basepair mismatches,
clones can be selected if they hybridize under the
conditions described in Maniatis et al., op. cit., pp.
320-323, but using the following wash conditions: 2 x
SSC, 0.1% SDS, room temperature - twice, 30 minutes
each; then 2 x SSC, 0.1% SDS, 50°C - once, 30 minutes;
then 2 x SSC, room temperature - twice, 10 minutes
each. These conditions allowed identification of the
Mexican isolate discussed above using the ET1.1
sequence as a probe. Plaques which show selective
hybridization to the infected-source probes are
preferably re-plated at low plating density and
re-screened as above, to isolate single clones which are
specific for ET-NANB sequences. As indicated in
Example 2, sixteen clones which hybridized
specifically with infected-source probes were
identified by these procedures. One of the clones,
designated lambda gt10-1.1, contained a 1.33 kilobase
fragment insert.
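
The selection logic of the differential screen described above can
be sketched in a few lines of Python. This is an illustration only:
the plaque identifiers and the helper name select_specific_plaques
are hypothetical, and in practice the scoring is made by comparing
autoradiographs of the duplicate filters.

```python
def select_specific_plaques(infected_hits, control_hits):
    """Return plaques that hybridize with the infected-source probe
    but not with the non-infected (control) probe."""
    return sorted(set(infected_hits) - set(control_hits))


# Hypothetical plaque identifiers scored on the duplicate filters.
infected = ["p01", "p02", "p05", "p09"]
control = ["p02", "p07"]

# Plaques kept for re-plating at low density and re-screening.
candidates = select_specific_plaques(infected, control)
print(candidates)
```

Plaques that light up on both filters (here "p02") are discarded as
non-specific; only the remainder go forward to re-screening.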

C. ET-NANB Sequences 

The basepair sequence of cloned regions of the
ET-NANB fragments from Part B is determined by
standard sequencing methods. In one illustrative
method, described in Example 3, the fragment insert
from the selected cloning vector is excised, isolated
by gel electrophoresis, and inserted into a cloning
vector whose basepair sequence on either side of the
insertion site is known. The particular vector
employed in Example 3 is a pTZKF1 vector shown at the
left in Figure 1. The ET-NANB fragment from the
gt10-1.1 phage was inserted at the unique EcoRI site of
the pTZKF1 plasmid. Recombinants carrying the desired
insert were identified by hybridization with the
isolated 1.33 kilobase fragment, as described in
Example 3. One selected plasmid, identified as
pTZKF1(ET1.1), gave the expected 1.33 kb fragment after
vector digestion with EcoRI. E. coli strain BB4
infected with the pTZKF1(ET1.1) plasmid has been
deposited with the American Type Culture Collection,
Rockville, MD, and is identified by ATCC deposit
number 67717.

The pTZKF1(ET1.1) plasmid is illustrated at the
bottom in Figure 1. The fragment insert has 5' and 3'
end regions denoted at A and C, respectively, and an
intermediate region, denoted at B. The sequences in
these regions were determined by standard dideoxy
sequencing and were set forth in an earlier
application in this series. The three short sequences
(A, B, and C) are from the same insert strand. As will
be seen in Example 3, the B-region sequence was
actually determined from the opposite strand, so that
the B region sequence shown above represents the
complement of the sequence in the sequenced strand.
The base numbers of the partial sequences are
approximate.

Later work in the laboratory of the inventors
identified the full sequence, set forth above.
Fragments of this total sequence can readily be
prepared using restriction endonucleases. Computer
analysis of both the forward and reverse sequence has
identified a number of cleavage sites.

III. ET-NANB Fragments 

According to another aspect, the invention
includes ET-NANB-specific fragments or probes which
hybridize with ET-NANB genomic sequences or cDNA
fragments derived therefrom. The fragments may include
full-length cDNA fragments such as described in
Section II, or may be derived from shorter sequence
regions within cloned cDNA fragments. Shorter
fragments can be prepared by enzymatic digestion of
full-length fragments under conditions which yield
desired-sized fragments, as will be described in
Section IV. Alternatively, the fragments can be
produced by oligonucleotide synthetic methods, using
sequences derived from the cDNA fragments. Methods or
commercial services for producing selected-sequence
oligonucleotide fragments are available. Fragments
are usually at least [illegible] nucleotides in length,
preferably at least 14, [illegible] or 50 nucleotides, when
used as probes. Probes can be full length or less
than 500, preferably less than 300 or 200, nucleotides
in length.

To confirm that a given ET-NANB fragment is
in fact derived from the ET-NANB viral agent, the
fragment can be shown to hybridize selectively with
cDNA from infected sources. By way of illustration, to
confirm that the 1.33 kb fragment in the pTZKF1(ET1.1)
plasmid is ET-NANB in origin, the fragment was excised
from the pTZKF1(ET1.1) plasmid, purified, and
radiolabeled by random labeling. The radiolabeled
fragment was hybridized with fractionated cDNAs from
infected and non-infected sources to confirm that the
probe reacts only with infected-source cDNAs. This
method is illustrated in Example 4, where the above
radiolabeled 1.33 kb fragment from pTZKF1(ET1.1)
plasmid was examined for binding to cDNAs prepared
from infected and non-infected sources. The infected
sources are (1) bile from a cynomolgus macaque
infected with a strain of virus derived from stool
samples from human patients from Burma with known
ET-NANB infections and (2) a viral agent derived from the
stool sample of a human ET-NANB patient from Mexico.
The cDNAs in each fragment mixture were first
amplified by a linker/primer amplification method
described in Example 4. Fragment separation was on
agarose gel, followed by Southern blotting and then
hybridization to bind the radiolabeled 1.33 kb
fragment to the fractionated cDNAs. The lane
containing cDNAs from the infected sources showed a
smeared band of bound probe, as expected (cDNAs
amplified by the linker/primer amplification method
would be expected to have a broad range of sizes). No
probe binding to the amplified cDNAs from the
non-infected sources was observed. The results indicate
that the 1.33 kb probe is specific for cDNA fragments
associated with ET-NANB infection. This same type of
study, using ET1.1 as the probe, has demonstrated
hybridization to ET-NANB samples collected from
Tashkent, Somalia, Borneo and Pakistan. Secondly, the
fact that the probe is specific for ET-NANB related
sequences derived from different continents (Asia,
Africa and North America) indicates the cloned ET-NANB
Burma sequence (ET1.1) is derived from a common
ET-NANB virus or virus class responsible for ET-NANB
hepatitis infection worldwide.

In a related confirmatory study, probe
binding to fractionated genomic fragments prepared
from human or cynomolgus macaque genomic DNA (both
infected and uninfected) was examined. No probe
binding was observed to either genomic fraction,
demonstrating that the ET-NANB fragment is not an
endogenous human or cynomolgus genomic fragment and
additionally demonstrating that HEV is an RNA virus.

Another confirmation of ET-NANB-specific
sequences in the fragments is the ability to express
ET-NANB proteins from coding regions in the fragments
and to demonstrate specific sero-reactivity of these
proteins with sera collected during documented
outbreaks of ET-NANB. Section IV below discusses
methods of protein expression using the fragments.

One important use of the ET-NANB-specific
fragments is for identifying ET-NANB-derived cDNAs
which contain additional sequence information. The
newly identified cDNAs, in turn, yield new fragment
probes, allowing further iterations until the entire
viral genome is identified and sequenced. Procedures
for identifying additional ET-NANB library clones and
generating new fragment probes generally follow the
procedures described in Section II.

The fragments (and oligonucleotides prepared
based on the sequences given above) are also useful as
primers for a polymerase chain reaction method of
detecting ET-NANB viral genomic material in a patient
sample. This diagnostic method will be described in
Section V below.

Two specific genetic sequences derived from
the Mexican strain, identified herein as 406.3-2 and
406.4-2, have been identified that encode immunogenic
epitopes. This was done by isolating clones which
encode epitopes that immunologically react
specifically with sera from individuals and
experimental animals infected with HEV. Comparison of
the isolated sequences with those in a collection of
known genetic sequences indicates that these viral
sequences are unique. Since these sequences are
unique, they can be used to identify the presence of
HEV and to distinguish this strain of hepatitis from
HAV, HBV, and HCV strains. The sequences are also
useful for the design of oligonucleotide probes and can
be used for the synthesis of polypeptides that
themselves are used in immunoassays. The specific
406.3-2 and 406.4-2 sequences can be incorporated into
other genetic material, such as vectors, for
expression or replication. They can also be used (as
demonstrated above) to identify similar antigenic
regions encoded by related viral strains, such as the
Burmese strain.



IV. ET-NANB Proteins

As indicated above, ET-NANB proteins can be
prepared by expressing open reading-frame coding
regions in ET-NANB fragments. In one preferred
approach, the ET-NANB fragments used for protein
expression are derived from cloned cDNAs which have
been treated to produce desired-size fragments, and
preferably random fragments with sizes predominantly
between about 100 and about 300 base pairs. Example 5
describes the preparation of such fragments by DNAse
digestion. Because it is desired to obtain peptide
antigens of between about 30 and about 100 amino acids,
the digest fragments are preferably size
fractionated, for example by gel electrophoresis, to
select those in the approximately 100-300 basepair
size range. Alternatively, cDNA libraries constructed
directly from HEV-containing sources (e.g., bile or
stool) can be screened directly if cloned into an
appropriate expression vector (see below).
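
The correspondence between the fragment size range and the desired
peptide length follows from the triplet code (three base pairs per
amino acid). A minimal sketch of the arithmetic, assuming a
fragment that is read in frame over its whole length; the function
names are illustrative only:

```python
BASES_PER_CODON = 3


def max_peptide_length(base_pairs):
    """Largest number of amino acids an in-frame fragment can encode."""
    return base_pairs // BASES_PER_CODON


def coding_length(amino_acids):
    """Base pairs needed to encode a peptide of the given length."""
    return amino_acids * BASES_PER_CODON


# Size-selected digest fragments of about 100-300 base pairs
print(max_peptide_length(100), max_peptide_length(300))  # 33 100

# Desired peptide antigens of about 30-100 amino acids
print(coding_length(30), coding_length(100))  # 90 300
```

Fragments of 100-300 base pairs therefore encode at most about
33-100 amino acids, which matches the stated target of roughly 30
to 100 amino acids per peptide antigen.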

For example, the ET-NANB proteins expressed
by the 406.3-2 and 406.4-2 sequences (and peptide
fragments thereof) are particularly preferred since
these proteins have been demonstrated to be
immunoreactive with a variety of different human sera,
thereby indicating the presence of one or more
epitopes specific for HEV on their surfaces. These
clones were identified by direct screening of a gt11
library.

A. Expression Vector 

The ET-NANB fragments are inserted into a
suitable expression vector. One exemplary expression
vector is lambda gt11, which contains a unique EcoRI
insertion site 53 base pairs upstream of the
translation termination codon of the
beta-galactosidase gene. Thus, the inserted sequence will
be expressed as a beta-galactosidase fusion protein
which contains the N-terminal portion of the
beta-galactosidase gene, the heterologous peptide, and
optionally the C-terminal region of the
beta-galactosidase peptide (the C-terminal portion being
expressed when the heterologous peptide coding sequence
does not contain a translation termination codon).

The viral genomic library formed above is
screened for production of peptide antigen (expressed
as a fusion protein) which is immunoreactive with
antiserum from ET-NANB seropositive individuals. In
a preferred screening method, host cells infected with
phage library vectors are plated, as above, and the
plate is blotted with a nitrocellulose filter to
transfer recombinant protein antigens produced by the
cells onto the filter. The filter is then reacted with
the ET-NANB antiserum, washed to remove unbound
antibody, and reacted with reporter-labeled,
anti-human antibody, which becomes bound to the filter, in
sandwich fashion, through the anti-ET-NANB antibody.

Typically phage plaques which are identified
by virtue of their production of recombinant antigen
of interest are re-examined at a relatively low
density for production of antibody-reactive fusion
protein. Several recombinant phage clones which
produced immunoreactive recombinant antigen were
identified in the procedure.

The selected expression vectors may be used
for scale-up production, for purposes of recombinant
protein purification. Scale-up production is carried
out using one of a variety of reported methods for (a)
lysogenizing a suitable host, such as E. coli, with a
selected lambda gt11 recombinant, (b) culturing the
transduced cells under conditions that yield high
levels of the heterologous peptide, and (c) purifying
the recombinant antigen from the lysed cells.

In one preferred method involving the above
lambda gt11 cloning vector, a high-producer E. coli
host, BNN103, is infected with the selected library
phage and replica plated on two plates. One of the
plates is grown at 32°C, at which viral lysogeny can
occur, and the other at 42°C, at which the infecting
phage is in a lytic stage and therefore prevents cell
growth. Cells which grow at the lower but not the
higher temperature are therefore assumed to be
successfully lysogenized.

The lysogenized host cells are then grown
under liquid culture conditions which favor high
production of the fused protein containing the viral
insert, and lysed by rapid freezing to release the
desired fusion protein.

C. Peptide Purification

The recombinant peptide can be purified by
standard protein purification procedures which may
include differential precipitation, molecular sieve
chromatography, ion-exchange chromatography,
isoelectric focusing, gel electrophoresis and
affinity chromatography. In the case of a fused
protein, such as the beta-galactosidase fused protein
prepared as above, the protein isolation techniques
which are used can be adapted from those used in
isolation of the native protein. Thus, for isolation
of a soluble beta-galactosidase fusion protein, the
protein can be isolated readily by simple affinity
chromatography, by passing the cell lysis material
over a solid support having surface-bound
anti-beta-galactosidase antibody.


D. Viral Proteins 

The ET-NANB protein of the invention may
also be derived directly from the ET-NANB viral agent.
VLPs or protein isolated from stool or liver samples
from an infected individual, as above, are one
suitable source of viral protein material. The VLPs
isolated from the stool sample may be further purified
by affinity chromatography prior to protein isolation
(see below). The viral agent may also be raised in
cell culture, which provides a convenient and
potentially concentrated source of viral protein.
Co-owned U.S. Patent Application Serial No. 846,757,
filed April 1, 1986, describes an immortalized trioma
liver cell which supports NANB infection in cell
culture. The trioma cell line is prepared by fusing
human liver cells with a mouse/human fusion partner
selected for human chromosome stability. Cells
containing the desired NANB viral agent can be
identified by immunofluorescence methods, employing
anti-ET-NANB human antibodies.

The viral agent is disrupted, prior to
protein isolation, by conventional methods, which can
include sonication, high- or low-salt conditions, or
use of detergents.

Purification of ET-NANB viral protein can be
carried out by affinity chromatography, using a
purified anti-ET-NANB antibody attached according to
standard methods to a suitable solid support. The
antibody itself may be purified by affinity
chromatography, where an immunoreactive recombinant
ET-NANB protein, such as described above, is attached
to a solid support, for isolation of anti-ET-NANB
antibodies from an immune serum source. The bound
antibody is released from the support by standard
methods.

Alternatively, the anti-ET-NANB antibody may
be an antiserum or a monoclonal antibody (Mab)
prepared by immunizing a mouse or other animal with
recombinant ET-NANB protein. For Mab production,
lymphocytes are isolated from the animal and
immortalized with a suitable fusion partner, and
successful fusion products which react with the
recombinant protein immunogen are selected. These in
turn may be used in affinity purification procedures,
described above, to obtain native ET-NANB antigen.

V. Utility 

Although ET-NANB is primarily of interest
because of its effects on humans, recent data has
shown that this virus is also capable of infecting
other animals, especially mammals. Accordingly, any
discussion herein of utility applies to both human and
veterinary uses, especially commercial veterinary
uses, such as the diagnosis and treatment of pigs,
cattle, sheep, horses, and other domesticated animals.

A. Diagnostic Methods

The particles and antigens of the invention,
as well as the genetic material, can be used in
diagnostic assays. Methods for detecting the presence
of ET-NANB hepatitis comprise analyzing a biological
sample such as a blood sample, stool sample or liver
biopsy specimen for the presence of an analyte
associated with ET-NANB hepatitis virus.

The analyte can be a nucleotide sequence
which hybridizes with a probe comprising a sequence of
at least about 16 consecutive nucleotides, usually 30
to 200 nucleotides, up to substantially the full
sequence of the sequences shown above (cDNA
sequences). The analyte can be RNA or cDNA. The
analyte is typically a virus particle suspected of
being ET-NANB or a particle for which this
classification is being ruled out. The virus particle
can be further characterized as having an RNA viral
genome comprising a sequence at least about 70%
homologous to a sequence of at least 12 consecutive
nucleotides of the "forward" and "reverse" sequences
given above, usually at least about 80% homologous to
at least about 50 consecutive nucleotides within the
sequences, and may comprise a sequence substantially
homologous to the full-length sequences. In order to
detect an analyte, where the analyte hybridizes to a
probe, the probe may contain a detectable label.
Particularly preferred for use as a probe are
sequences of consecutive nucleotides derived from the
406.3-2 and 406.4-2 clones described herein, since
these clones appear to be particularly diagnostic for
HEV.
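
One way to make the homology thresholds above concrete is to
compute ungapped percent identity over a window of consecutive
nucleotides. The sketch below rests on that assumption (the text
does not define how homology is to be measured), and the sequences
and function names are hypothetical, not taken from the sequences
of the invention.

```python
def percent_identity(a, b):
    """Percent of positions that match between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(query, reference, window):
    """Highest ungapped identity between any window-long stretch of the
    query and any window-long stretch of the reference."""
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(reference) - window + 1):
            best = max(best, percent_identity(q, reference[j:j + window]))
    return best


# Hypothetical sequences, for illustration only.
reference = "TTACGTACGTACGTTT"
query = "ACGTACGTACGA"  # differs from a reference stretch at one position

score = best_window_identity(query, reference, window=12)
meets_criterion = score >= 70.0  # "at least about 70%" over 12 nt
print(round(score, 1), meets_criterion)
```

The same function applied with window=50 and a threshold of 80
would correspond to the narrower criterion given in the text.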

The analyte can also comprise an antibody
which recognizes an antigen, such as a cell surface
antigen, on an ET-NANB virus particle. The analyte can
also be an ET-NANB viral antigen. Where the analyte is
an antibody or an antigen, either a labelled antigen
or antibody, respectively, can be used to bind to the
analyte to form an immunological complex, which can
then be detected by means of the label.

Typically, methods for detecting analytes
such as surface antigens and/or whole particles are
based on immunoassays. Immunoassays can be conducted
either to determine the presence of antibodies in the
host that have arisen from infection by ET-NANB
hepatitis virus or by assays that directly determine
the presence of virus particles or antigens. Such
techniques are well known and need not be described
here in detail. Examples include both heterogeneous
and homogeneous immunoassay techniques. Both
techniques are based on the formation of an
immunological complex between the virus particle or
its antigen and a corresponding specific antibody.
Heterogeneous assays for viral antigens typically use
a specific monoclonal or polyclonal antibody bound to
a solid surface. Sandwich assays are becoming
increasingly popular. Homogeneous assays, which are
carried out in solution without the presence of a
solid phase, can also be used, for example by
determining the difference in enzyme activity brought
on by binding of free antibody to an enzyme-antigen
conjugate. A number of suitable assays are disclosed
in U.S. Patent Nos. 3,817,837, 4,005,360, and 3,996,345.

When assaying for the presence of antibodies
induced by ET-NANB viruses, the viruses and antigens
of the invention can be used as specific binding
agents to detect either IgG or IgM antibodies. Since
IgM antibodies are typically the first antibodies that
appear during the course of an infection, when IgG
synthesis may not yet have been initiated,
specifically distinguishing between IgM and IgG
antibodies present in the blood stream of a host will
enable a physician or other investigator to determine
whether the infection is recent or convalescent.
Proteins expressed by the 406.3-2 and 406.4-2 clones
described herein and peptide fragments thereof are
particularly preferred for use as specific binding
agents to detect antibodies since they have been
demonstrated to be reactive with a number of different
human HEV sera. Further, they are reactive with both
acute and convalescent sera.
In one diagnostic configuration, test serum
is reacted with a solid phase reagent having
surface-bound ET-NANB protein antigen. After binding
anti-ET-NANB antibody to the reagent and removing unbound
serum components by washing, the reagent is reacted
with reporter-labeled anti-human antibody to bind
reporter to the reagent in proportion to the amount of
bound anti-ET-NANB antibody on the solid support. The
reagent is again washed to remove unbound labeled
antibody, and the amount of reporter associated with
the reagent is determined. Typically, the reporter is
an enzyme which is detected by incubating the solid
phase in the presence of a suitable fluorometric or
colorimetric substrate.

The solid surface reagent in the above assay
is prepared by known techniques for attaching protein
material to solid support material, such as polymeric
beads, dip sticks, or filter material. These
attachment methods generally include non-specific
adsorption of the protein to the support or covalent
attachment of the protein, typically through a free
amine group, to a chemically reactive group on the
solid support, such as an activated carboxyl, hydroxyl,
or aldehyde group.

In a second diagnostic configuration, known
as a homogeneous assay, antibody binding to a solid
support produces some change in the reaction medium
which can be directly detected in the medium. Known
general types of homogeneous assays proposed
heretofore include (a) spin-labeled reporters, where
antibody binding to the antigen is detected by a
change in reporter mobility (broadening of the spin
splitting peaks), (b) fluorescent reporters, where
binding is detected by a change in fluorescence
efficiency, (c) enzyme reporters, where antibody
binding affects enzyme/substrate interactions, and (d)
liposome-bound reporters, where binding leads to
liposome lysis and release of encapsulated reporter.
The adaptation of these methods to the protein antigen
of the present invention follows conventional methods
for preparing homogeneous assay reagents.

In each of the assays described above, the
assay method involves reacting the serum from a test
individual with the protein antigen and examining the
antigen for the presence of bound antibody. The
examining may involve attaching a labeled anti-human
antibody to the antibody being examined, either IgM
(acute phase) or IgG (convalescent phase), and
measuring the amount of reporter bound to the solid
support, as in the first method, or may involve
observing the effect of antibody binding on a
homogeneous assay reagent, as in the second method.

Also forming part of the invention is an
assay system or kit for carrying out the assay method
just described. The kit generally includes a support
with surface-bound recombinant protein antigen which
is (a) immunoreactive with antibodies present in
individuals infected with enterically transmitted
nonA/nonB viral agent and (b) derived from a viral
hepatitis agent whose genome contains a region which
is homologous to the 1.33 kb DNA EcoRI insert present
in plasmid pTZKF1(ET1.1) carried in E. coli strain
BB4, and having ATCC deposit no. 67717. A
reporter-labeled anti-human antibody in the kit is used for
detecting surface-bound anti-ET-NANB antibody.

B. Viral Genome Diagnostic Applications

The genomic material of the invention can
itself be used in numerous assays as probes for
genetic material present in naturally occurring
infections. One method for amplification of target
nucleic acids, for later analysis by hybridization
assays, is known as the polymerase chain reaction or
PCR technique. The PCR technique can be applied to
detecting virus particles of the invention in
suspected pathological samples using oligonucleotide
primers spaced apart from each other and based on the
genetic sequence set forth above. The primers are
complementary to opposite strands of a double stranded
DNA molecule and are typically separated by from about
50 to 450 nt or more (usually not more than 2000 nt).
This method entails preparing the specific
oligonucleotide primers and then repeated cycles of
target DNA denaturation, primer binding, and
extension with a DNA polymerase to obtain DNA
fragments of the expected length based on the primer
spacing. Extension products generated from one primer
serve as additional target sequences for the other
primer. The degree of amplification of a target
sequence is controlled by the number of cycles that
are performed and is theoretically calculated by the
simple formula 2^n where n is the number of cycles.
Given that the average efficiency per cycle ranges
from about 65% to 85%, 25 cycles produce from 0.3 to
4.8 million copies of the target sequence. The PCR
method is described in a number of publications,
including Saiki et al., Science (1985) 230:1350-1354;
Saiki et al., Nature (1986) 324:163-166; and Scharf et
al., Science (1986) 233:1076-1078. Also see U.S.
Patent Nos. 4,683,194; 4,683,195; and 4,683,202.
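
The amplification figures quoted above can be checked with a short
calculation. The sketch assumes that a per-cycle efficiency E means
each cycle multiplies the copy number by (1 + E), so that E = 100%
gives the theoretical 2^n; this model is an assumption, but it
reproduces the 0.3 to 4.8 million range stated for 25 cycles. The
function name pcr_copies is illustrative.

```python
def pcr_copies(cycles, efficiency=1.0, start_copies=1):
    """Expected copies after a number of cycles, assuming each cycle
    multiplies the copy number by (1 + efficiency)."""
    return start_copies * (1.0 + efficiency) ** cycles


theoretical = pcr_copies(25)       # 2**25 at 100% efficiency
low = pcr_copies(25, 0.65)         # about 0.27 million
high = pcr_copies(25, 0.85)        # about 4.8 million

print(f"theoretical: {theoretical:,.0f}")
print(f"65% efficiency: {low / 1e6:.2f} million")
print(f"85% efficiency: {high / 1e6:.2f} million")
```

At realistic efficiencies the yield after 25 cycles falls one to two
orders of magnitude short of the theoretical 2^25 (about 33.6
million), which is why the cycle number is the practical control on
the degree of amplification.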

The invention includes a specific diagnostic
method for determination of ET-NANB viral agent, based
on selective amplification of ET-NANB fragments. This
method employs a pair of single-strand primers derived
from non-homologous regions of opposite strands of a
DNA duplex fragment, which in turn is derived from an
enterically transmitted viral hepatitis agent whose
genome contains a region which is homologous to the
1.33 kb DNA EcoRI insert present in plasmid
pTZKF1(ET1.1) carried in E. coli strain BB4, and
having ATCC deposit no. 67717. These "primer
fragments," which form one aspect of the invention,
are prepared from fragments such as described
in Section III above. The method follows the process
for amplifying selected nucleic acid sequences as
disclosed in U.S. Patent No. 4,683,202, as discussed
above.



C. Peptide Vaccine

Any of the antigens of the invention can be
used in preparation of a vaccine. A preferred starting
material for preparation of a vaccine is the particle
antigen isolated from bile. The antigens are
preferably initially recovered as intact particles as
described above. However, it is also possible to
prepare a suitable vaccine from particles isolated from
other sources or non-particle recombinant antigens.
When non-particle antigens are used (typically soluble
antigens), proteins derived from the viral envelope or
viral capsid are preferred for use in preparing
vaccines. These proteins can be purified by affinity
chromatography, also described above.

If the purified protein is not immunogenic
per se, it can be bound to a carrier to make the
protein immunogenic. Carriers include bovine serum
albumin, keyhole limpet hemocyanin and the like. It is
desirable, but not necessary, to purify antigens to be
substantially free of human protein. However, it is
more important that the antigens be free of proteins,
viruses, and other substances not of human origin
that may have been introduced by way of, or
contamination of, the nutrient medium, cell lines,
tissues, or pathological fluids from which the virus
is cultured or obtained.

Vaccination can be conducted in conventional
fashion. For example, the antigen, whether a viral
particle or a protein, can be used in a suitable
diluent such as water, saline, buffered salines,
complete or incomplete adjuvants, and the like. The
immunogen is administered using standard techniques
for antibody induction, such as by subcutaneous
administration of physiologically compatible, sterile
solutions containing inactivated or attenuated virus
particles or antigens. An immune-response-producing
amount of virus particles is typically administered
per vaccinizing injection, typically in a volume of
one milliliter or less.

A specific example of a vaccine composition
includes, in a pharmacologically acceptable adjuvant,
a recombinant protein or protein mixture derived from
an enterically transmitted nonA/nonB viral hepatitis
agent whose genome contains a region which is
homologous to the 1.33 kb DNA EcoRI insert present in
plasmid pTZKF1(ET1.1) carried in E. coli strain BB4,
and having ATCC deposit no. 67717. The vaccine is
administered at periodic intervals until a significant
titer of anti-ET-NANB antibody is detected in the
serum. The vaccine is intended to protect against
ET-NANB infection.

Particularly preferred are vaccines prepared
using proteins expressed by the 406.3-2 and 406.4-2
clones described herein and equivalents thereof,
including fragments of the expressed proteins. Since
these clones have already been demonstrated to be
reactive with a variety of human HEV-positive sera,
their utility in protecting against a variety of HEV
strains is indicated.

D. Prophylactic and Therapeutic
Antibodies and Antisera

In addition to use as a vaccine, the
compositions can be used to prepare antibodies to
ET-NANB virus particles. The antibodies can be used
directly as antiviral agents. To prepare antibodies, a
host animal is immunized using the virus particles or,
as appropriate, non-particle antigens native to the
virus particle are bound to a carrier as described
above for vaccines. The host serum or plasma is
collected following an appropriate time interval to
provide a composition comprising antibodies reactive
with the virus particle. The gamma globulin fraction
or the IgG antibodies can be obtained, for example, by
use of saturated ammonium sulfate or DEAE Sephadex, or
other techniques known to those skilled in the art.
The antibodies are substantially free of many of the
adverse side effects which may be associated with
other anti-viral agents such as drugs.

The antibody compositions can be made even
more compatible with the host system by minimizing
potential adverse immune system responses. This is
accomplished by removing all or a portion of the Fc
portion of a foreign species antibody or using an
antibody of the same species as the host animal, for
example, the use of antibodies from human/human
hybridomas.

The antibodies can also be used as a means
of enhancing the immune response since antibody-virus
complexes are recognized by macrophages. The
antibodies can be administered in amounts similar to
those used for other therapeutic administrations of
antibody. For example, pooled gamma globulin is
administered at 0.02-0.1 ml/lb body weight during the
early incubation of other viral diseases such as rabies,
measles and hepatitis B to interfere with viral entry
into cells. Thus, antibodies reactive with the ET-NANB
virus particle can be passively administered alone or
in conjunction with another anti-viral agent to a host
infected with an ET-NANB virus to enhance the immune
response and/or the effectiveness of an antiviral
drug.

Alternatively, anti-ET-NANB-virus antibodies
can be induced by administering anti-idiotype
antibodies as immunogens. Conveniently, a purified
anti-ET-NANB-virus antibody preparation prepared as
described above is used to induce anti-idiotype antibody
in a host animal. The composition is administered to
the host animal in a suitable diluent. Following
administration, usually repeated administration, the
host produces anti-idiotype antibody. To eliminate an
immunogenic response to the Fc region, antibodies
produced by the same species as the host animal can be
used or the Fc region of the administered antibodies
can be removed. Following induction of anti-idiotype
antibody in the host animal, serum or plasma is
removed to provide an antibody composition. The
composition can be purified as described above for
anti-ET-NANB virus antibodies, or by affinity
chromatography using anti-ET-NANB-virus antibodies
bound to the affinity matrix. The anti-idiotype
antibodies produced are similar in conformation to the
authentic ET-NANB antigen and may be used to prepare
an ET-NANB vaccine rather than using an ET-NANB
particle antigen.

When used as a means of inducing anti-ET-NANB
virus antibodies in a patient, the manner of
injecting the antibody is the same as for vaccination
purposes, namely intramuscularly, intraperitoneally,
subcutaneously or the like in an effective
concentration in a physiologically suitable diluent
with or without adjuvant. One or more booster
injections may be desirable. The anti-idiotype method
of induction of anti-ET-NANB virus antibodies can
alleviate problems which may be caused by passive
administration of anti-ET-NANB-virus antibodies, such
as an adverse immune response, and those associated
with administration of purified blood components, such
as infection with as yet undiscovered viruses.

The ET-NANB derived proteins of the invention are
also intended for use in producing antiserum designed
for pre- or post-exposure prophylaxis. Here an ET-NANB
protein, or mixture of proteins, is formulated with a
suitable adjuvant and administered by injection to
human volunteers, according to known methods for
producing human antisera. Antibody response to the
injected proteins is monitored, during a several-week
period following immunization, by periodic serum
sampling to detect the presence of anti-ET-NANB serum
antibodies, as described in Section IIA above.

The antiserum from immunized individuals may
be administered as a pre-exposure prophylactic measure
for individuals who are at risk of contracting
infection. The antiserum is also useful in treating an
individual post-exposure, analogous to the use of high
titer antiserum against hepatitis B virus for
post-exposure prophylaxis.

E. Monoclonal Antibodies

For both in vivo use of antibodies to ET-NANB
virus particles and proteins and anti-idiotype
antibodies and diagnostic use, it may be preferable to
use monoclonal antibodies. Monoclonal anti-virus
particle antibodies or anti-idiotype antibodies can be
produced as follows. The spleen or lymphocytes from an
immunized animal are removed and immortalized or used
to prepare hybridomas by methods known to those
skilled in the art. To produce a human-human
hybridoma, a human lymphocyte donor is selected. A
donor known to be infected with an ET-NANB virus (where
infection has been shown for example by the presence
of anti-virus antibodies in the blood or by virus
culture) may serve as a suitable lymphocyte donor.
Lymphocytes can be isolated from a peripheral blood
sample or spleen cells may be used if the donor is
subject to splenectomy. Epstein-Barr virus (EBV) can
be used to immortalize human lymphocytes or a human
fusion partner can be used to produce human-human
hybridomas. Primary in vitro immunization with
peptides can also be used in the generation of human
monoclonal antibodies.

Antibodies secreted by the immortalized 
cells are screened to determine the clones that 
secrete antibodies of the desired specificity. For 
monoclonal anti-virus particle antibodies, the 
antibodies must bind to ET-NANB virus particles. For 
monoclonal anti-idiotype antibodies, the antibodies 
must bind to anti-virus particle antibodies. Cells 
producing antibodies of the desired specificity are 
selected. 

The following examples illustrate various 
aspects of the invention, but are in no way intended 
to limit the scope thereof. 

Material 

The materials used in the following Examples 
were as follows: 

Enzymes: DNAse I and alkaline phosphatase
were obtained from Boehringer Mannheim Biochemicals
(BMB, Indianapolis, IN); EcoRI, EcoRI methylase, DNA
ligase, and DNA Polymerase I, from New England Biolabs
(NEB, Beverly, MA); and RNase A was obtained from Sigma
(St. Louis, MO).

Other reagents: EcoRI linkers were obtained
from NEB; and nitro blue tetrazolium (NBT),
5-bromo-4-chloro-3-indolyl phosphate (BCIP),
5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (Xgal)
and isopropyl β-D-thiogalactopyranoside (IPTG) were
obtained from Sigma.

A cDNA synthesis kit and random-priming
labeling kits are available from Boehringer-Mannheim
Biochemicals (BMB, Indianapolis, IN).






Example 1
Preparing cDNA Library 

A. Source of ET-NANB virus 

Two cynomolgus monkeys (cynos) were
intravenously injected with a 10% suspension of a
stool pool obtained from a second-passage cyno (cyno
#37) infected with a strain of ET-NANB virus isolated
from Burma cases whose stools were positive for
ET-NANB, as evidenced by binding of 27-34 nm virus-like
particles (VLPs) in the stool to immune serum from a
known ET-NANB patient. The animals developed elevated
levels of alanine aminotransferase (ALT) between 24 and
36 days after inoculation, and one excreted 27-34 nm
VLPs in its bile in the pre-acute phase of infection.

The bile duct of each infected animal was
cannulated and about 1-3 cc of bile was collected
daily. RNA was extracted from one bile specimen (cyno
#121) by hot phenol extraction, using a standard RNA
isolation procedure. Double-strand cDNA was formed
from the isolated RNA by a random primer for
first-strand generation, using a cDNA synthesis kit
obtained from Boehringer-Mannheim (Indianapolis, IN).

B. Cloning the Duplex Fragments 

The duplex cDNA fragments were blunt-ended
with T4 DNA polymerase under standard conditions
(Maniatis, p. 118), then extracted with
phenol/chloroform and precipitated with ethanol. The
blunt-ended material was ligated with EcoRI linkers
under standard conditions (Maniatis, pp. 396-397) and
digested with EcoRI to remove redundant linker ends.
Non-ligated linkers were removed by sequential
isopropanol precipitation.

Lambda gt10 phage vector (Huynh) was
obtained from Promega Biotec (Madison, WI). This
cloning vector has a unique EcoRI cloning site in the
phage cI repressor gene. The cDNA fragments from above
were introduced into the EcoRI site by mixing 0.5-1.0
µg EcoRI-cleaved gt10, 0.5-3 µl of the above
duplex fragments, 0.5 µl 10X ligation buffer, 0.5 µl
ligase (200 units), and distilled water to 5 µl. The
mixture was incubated overnight at 14°C, followed by
in vitro packaging, according to standard methods
(Maniatis, pp. 256-268).

The packaged phage were used to infect an E.
coli hfl strain, such as strain HG415. Alternatively,
E. coli strain C600 hfl, available from Promega
Biotec, Madison, WI, could be used. The percentage of
recombinant plaques obtained with insertion of the
EcoRI-ended fragments was less than 5% by analysis of
20 random plaques.

The resultant cDNA library was plated and
phage were eluted from the selection plates by
addition of elution buffer. After DNA extraction from
the phage, the DNA was digested with EcoRI to release
the heterogeneous insert population, and the DNA
fragments were fractionated on agarose to remove phage
fragments. The 500-4,000 basepair inserts were
isolated and recloned into lambda gt10 as above, and
the packaged phage was used to infect E. coli strain
HG415. The percentage of successful recombinants was
greater than 95%. The phage library was plated on
E. coli strain HG415, at about 5,000 plaques/plate, on a
total of 8 plates.
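
The plating figures above imply a library of roughly 40,000 plaques, which agrees with the number of clones examined in Example 2. The following minimal Python sketch only restates that arithmetic; the variable names are illustrative and are not part of the described method, and the recombinant estimate treats "greater than 95%" as a lower bound.

```python
# Approximate library size from the plating figures given in Example 1.
plaques_per_plate = 5_000   # "about 5,000 plaques/plate"
plates = 8                  # "a total of 8 plates"
min_recombinant_percent = 95  # "greater than 95%" taken as a lower bound

total_plaques = plaques_per_plate * plates
# Integer arithmetic avoids floating-point rounding in the estimate.
min_recombinants = total_plaques * min_recombinant_percent // 100

print(total_plaques)      # 40000
print(min_recombinants)   # 38000
```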

Example 2 
Selecting ET-NANB Cloned Fragments 
A. cDNA Probes

Duplex cDNA fragments from noninfected and
ET-NANB-infected cynomolgus monkeys were prepared as in
Example 1. The cDNA fragments were radiolabeled by
random priming, using a random-priming labeling kit
obtained from Boehringer-Mannheim (Indianapolis, IN).

B. Clone Selection 
The plated cDNA library from Example 1 was
transferred to each of two nitrocellulose filters, and
the phage DNA was fixed on the filters by baking,
according to standard methods (Maniatis, pp. 320-323).
The duplicate filters were hybridized with either
infected-source or control cDNA probes from above.
Autoradiographs of the filters were examined to
identify library clones which hybridized with
radiolabeled cDNA probes from infected source only,
i.e., did not hybridize with cDNA probes from the
non-infected source. Sixteen such clones, out of a total
of about 40,000 clones examined, were identified by
this subtraction selection method.
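
In computational terms, the subtraction selection amounts to a set difference between the plaques scored positive on the two duplicate filters. The sketch below is a hedged illustration only: the function name and the plaque identifiers are hypothetical, and scoring of autoradiographs is assumed to have been done already.

```python
def subtraction_select(infected_hits, control_hits):
    """Return plaque IDs positive with the infected-source probe only."""
    return sorted(set(infected_hits) - set(control_hits))

# Hypothetical plaque identifiers, for illustration only.
infected_hits = {"p001", "p002", "p003", "p004"}
control_hits = {"p002", "p004", "p005"}

selected = subtraction_select(infected_hits, control_hits)
print(selected)  # ['p001', 'p003']
```

A plaque positive on both filters (here p002 and p004) is discarded, since it presumably represents host rather than viral sequence.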

Each of the sixteen clones was picked and
replated at low concentration on an agar plate. The
clones on each plate were transferred to two
nitrocellulose filters as duplicate lifts, and examined
for hybridization to radiolabeled cDNA probes from
infected and noninfected sources, as above. Clones were
selected which showed selective binding for infected-source
probes (i.e., binding with infected-source probes and
substantially no binding with non-infected-source
probes). One of the clones which bound selectively to
probe from infected source was isolated for further
study. The selected vector was identified as lambda
gt10-1.1, indicated in Figure 1.

Example 3 
ET-NANB Sequence 

Clone lambda gt10-1.1 from Example 2 was digested
with EcoRI to release the heterologous insert, which
was separated from the vector fragments by gel
electrophoresis. The electrophoretic mobility of the
fragment was consistent with a 1.33 kb fragment. This
fragment, which contained EcoRI ends, was inserted
into the EcoRI site of a pTZKF1 vector, whose
construction and properties are described in co-owned
U.S. patent application for "Cloning Vector System and
Method for Rare Clone Identification", Serial No.
125,650, filed November 25, 1987. Briefly, and as
illustrated in Figure 1, this plasmid contains a
unique EcoRI site adjacent a T7 polymerase promoter
site, and plasmid and phage origins of replication.
The sequence immediately adjacent each side of the
EcoRI site is known. E. coli BB4 bacteria, obtained
from Stratagene (La Jolla, CA), were transformed with
the plasmid.

Radiolabeled ET-NANB probe was prepared by
excising the 1.33 kb insert from the lambda gt10-1.1
phage in Example 2, separating the fragment by gel
electrophoresis, and randomly labeling as above.
Bacteria transfected with the above pTZKF1 and
containing the desired ET-NANB insert were selected by
replica lift and hybridization with the radiolabeled
ET-NANB probe, according to methods outlined in
Example 2.

One bacterial colony containing a
successful recombinant was used for sequencing a
portion of the 1.33 kb insert. This isolate,
designated pTZKF1(ET1.1), has been deposited with the
American Type Culture Collection, and is identified by
ATCC deposit no. 67717. Using a standard dideoxy
sequencing procedure, and primers for the sequences
flanking the EcoRI site, about 200-250 basepairs of
sequence from the 5'-end region and 3'-end region of
the insert were obtained. The sequences are given
above in Section II. Later sequencing by the same
techniques gave the full sequence in both directions,
also given above.

Example 4
Detecting ET-NANB Sequences

cDNA fragment mixtures from the bile of
noninfected and ET-NANB-infected cynomolgus monkeys
were prepared as above. The cDNA fragments obtained
from human stool samples were prepared as follows.
Thirty ml of a 10% stool suspension obtained from an
individual from [location illegible] infected with
ET-NANB as a result of an ET-NANB outbreak, and a similar
volume of stool from a healthy, non-infected
individual, were layered over a sucrose density
gradient cushion and centrifuged at 25,000 x g for 6
hr in an SW27 rotor, at [temperature illegible]. The
pelleted material from the infected source contained
27-34 nm VLP particles characteristic of ET-NANB
infection in the infected-stool sample. RNA was isolated
from the sucrose-gradient pellets in both the infected
and non-infected samples, and the isolated RNA was used
to produce cDNA fragments as described in Example 1.

The cDNA fragment mixtures from infected and
non-infected bile source, and from infected and
non-infected human-stool source were each amplified by a
novel linker/primer replication method described in
co-owned patent application serial number 07/208,512
for "DNA Amplification and Subtraction Technique,"
filed June 17, 1988. Briefly, the fragments in each
sample were blunt-ended with DNA Pol I then extracted
with phenol/chloroform and precipitated with ethanol.
The blunt-ended material was ligated with linkers
having the following sequence (top or 5' sequence has
SEQ ID NO:21; bottom or 3' sequence has SEQ ID NO:22):

5'-GGAATTCGCGGCCGCTCG-3'
3'-TTCCTTAAGCGCCGGCGAGC-5'
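
As a consistency check on the linker as printed, the short Python sketch below confirms that the bottom strand pairs with the top strand apart from a two-base overhang, that the linker carries an EcoRI site (GAATTC), and that two linkers joined at their blunt ends create an NruI site (TCGCGA), which is why NruI digestion removes linker dimers. The helper names are illustrative assumptions and are not taken from the referenced application.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Base-by-base complement, keeping the same left-to-right order."""
    return seq.translate(COMPLEMENT)

def reverse_complement(seq):
    """Complementary strand read 5'->3'."""
    return complement(seq)[::-1]

top = "GGAATTCGCGGCCGCTCG"            # SEQ ID NO:21, written 5'->3'
bottom_3to5 = "TTCCTTAAGCGCCGGCGAGC"  # SEQ ID NO:22, written 3'->5'

# The bottom strand is longer; the extra bases form a single-strand overhang.
overhang = len(bottom_3to5) - len(top)
strands_pair = bottom_3to5[overhang:] == complement(top)

# Two linkers ligated head-to-head through their blunt ends (a linker dimer).
dimer = top + reverse_complement(top)

has_ecori = "GAATTC" in top
dimer_has_nrui = "TCGCGA" in dimer

print(overhang, strands_pair, has_ecori, dimer_has_nrui)  # 2 True True True
```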

The duplex fragments were digested with
NruI to remove linker dimers, mixed with a primer
having the sequence 5'-GGAATTCGCGGCCGCTCG-3', and then
heat denatured and cooled to room temperature to form
single-strand DNA/primer complexes. The complexes were
replicated to form duplex fragments by addition of
Thermus aquaticus (Taq) polymerase and all four
deoxynucleotides. The replication procedure,
involving successive strand denaturation, formation of
strand/primer complexes, and replication, was repeated
25 times.
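
For orientation, the theoretical yield of 25 replication rounds can be estimated under the simplifying assumption that each round multiplies the template by (1 + efficiency), where an efficiency of 1.0 means perfect doubling. The sketch below is illustrative only; the efficiency parameter is an assumption and no per-cycle efficiency is reported in this example.

```python
def fold_amplification(cycles, efficiency=1.0):
    """Theoretical fold increase after `cycles` rounds of replication.

    efficiency is the fraction of templates copied per round (0.0-1.0);
    1.0 corresponds to perfect doubling each round.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles

ideal = fold_amplification(25)           # upper bound, perfect doubling
reduced = fold_amplification(25, 0.8)    # assumed 80% per-round efficiency

print(ideal)  # 33554432.0
```

The ideal figure (2 to the 25th power, about 3.4 x 10^7) is an upper bound; actual yields are lower because per-round efficiency is below 100% and reagents become limiting.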

The amplified cDNA sequences were
fractionated by agarose gel electrophoresis, using a
2% agarose matrix. After transfer of the DNA fragments
from the agarose gels to nitrocellulose paper, the
filters were hybridized to a random-labeled 32P probe
prepared by (i) treating the pTZKF1(ET1.1) plasmid
from above with EcoRI, (ii) isolating the released
1.33 kb ET-NANB fragment, and (iii) randomly labeling
the isolated fragment. The probe hybridization was
performed by conventional Southern blotting methods
(Maniatis, pp. 382-389). Figure 2 shows the
hybridization pattern obtained with cDNAs from
infected (I) and non-infected (N) bile sources (2A)
and from infected (I) and noninfected (N) human stool
sources (2B). As seen, the ET-NANB probe hybridized
with fragments obtained from both of the infected
sources, but was non-homologous to sequences obtained
from either of the non-infected sources, thus
confirming the specificity of the derived sequence.

Southern blots of the radiolabeled 1.33 kb
fragment with genomic DNA fragments from both human
and cynomolgus-monkey DNA were also prepared. No
probe hybridization to either of the genomic fragment
mixtures was observed, confirming that the ET-NANB
sequence is exogenous to either human or cynomolgus
genome.

Example 5 
Expressing ET-NANB Proteins 
A. Preparing ET-NANB Coding Sequences 

The pTZKF1(ET1.1) plasmid from Example 2
was digested with EcoRI to release the 1.33 kb ET-NANB
insert which was purified from the linearized plasmid
by gel electrophoresis. The purified fragment was
suspended in a standard digest buffer (0.5 M Tris HCl,
pH 7.5; 1 mg/ml BSA; 10 mM MnCl2) to a concentration of
about 1 mg/ml and digested with DNAse I at room
temperature for about 5 minutes. These reaction
conditions were determined from a prior calibration
study, in which the incubation time required to
produce predominantly 100-300 basepair fragments was
determined. The material was extracted with
phenol/chloroform before ethanol precipitation.

The fragments in the digest mixture were
blunt-ended and ligated with EcoRI linkers as in
Example 1. The resultant fragments were analyzed by
electrophoresis (5-10 V/cm) on 1.2% agarose gel, using
PhiX174/HaeIII and lambda/HindIII size markers. The
100-300 bp fraction was eluted onto NA45 strips
(Schleicher and Schuell), which were then placed into
1.5 ml microtubes with eluting solution (1 M NaCl, 50
mM arginine, pH 9.0), and incubated at 67°C for 30-60
minutes. The eluted DNA was phenol/chloroform
extracted and then precipitated with two volumes of
ethanol. The pellet was resuspended in 20 µl TE (0.01
M Tris HCl, pH 7.5, 0.001 M EDTA).

B. Cloning in an Expression Vector 

Lambda gt11 phage vector (Huynh) was
obtained from Promega Biotec (Madison, WI). This
cloning vector has a unique EcoRI cloning site 53 base
pairs upstream from the beta-galactosidase translation
termination codon. The genomic fragments from above,
provided either directly from coding sequences
(Example 5) or after amplification of cDNA (Example
4), were introduced into the EcoRI site by mixing
0.5-1.0 µg EcoRI-cleaved gt11, 0.3-3 µl of the above sized
fragments, 0.5 µl 10X ligation buffer (above), 0.5 µl
ligase (200 units), and distilled water to 5 µl. The
mixture was incubated overnight at 14°C, followed by
in vitro packaging, according to standard methods
(Maniatis, pp. 256-268).

The packaged phage were used to infect E.
coli strain KM392, obtained from Dr. Kevin Moore, DNAX
(Palo Alto, CA). Alternatively, E. coli strain Y1090,
available from the American Type Culture Collection
(ATCC #37197), could be used. The infected bacteria
were plated and the resultant colonies were checked
for loss of beta-galactosidase activity (clear
plaques) in the presence of X-gal using a standard
X-gal substrate plaque assay method (Maniatis). About
50% of the phage plaques showed loss of
beta-galactosidase enzyme activity (recombinants).

C. Screening for ET-NANB Recombinant Proteins 

ET-NANB convalescent antiserum was obtained
from patients infected during documented ET-NANB
outbreaks in Mexico, Borneo, Pakistan, Somalia, and
Burma. The sera were immunoreactive with VLPs in stool
specimens from each of several other patients with
ET-NANB hepatitis.

A lawn of E. coli KM392 cells infected with
about 10^4 pfu of the phage stock from above was
prepared on a 150 mm plate and incubated, inverted,
for 5-8 hours at 37°C. The lawn was overlaid with a
nitrocellulose sheet, causing transfer of expressed
ET-NANB recombinant protein from the plaques to the
paper. The plate and filter were indexed for matching
corresponding plate and filter positions.

The filter was washed twice in TBST buffer
(10 mM Tris, pH 8.0, 150 mM NaCl, 0.05% Tween 20),
blocked with AIB (TBST buffer with 1% gelatin), washed
again in TBST, and incubated overnight after addition
of antiserum (diluted to 1:50 in AIB, 12-15 ml/plate).
The sheet was washed twice in TBST and then contacted
with enzyme-labeled anti-human antibody to attach the
labeled antibody at filter sites containing antigen
recognized by the antiserum. After a final washing,
the filter was developed in a substrate medium
containing 33 µl NBT (50 mg/ml stock solution
maintained at 4°C) mixed with 16 µl BCIP (50 mg/ml
stock solution maintained at 4°C) in 5 ml of alkaline
phosphatase buffer (100 mM Tris, pH 9.5, 100 mM NaCl, 5
mM MgCl2). Purple color appeared at points of antigen
production, as recognized by the antiserum.

D. Screening Plating

The areas of antigen production determined
in the previous step were replated at about 100-200
pfu on an 82 mm plate. The above steps, beginning with
a 5-8 hour incubation, through NBT-BCIP development,
were repeated in order to plaque purify phage
secreting an antigen capable of reacting with the
ET-NANB antibody. The identified plaques were picked and
eluted in phage buffer (Maniatis, p. 443).

E. Epitope Identification

A series of subclones derived from the
original pTZKF1(ET1.1) plasmid from Example 2 were
isolated using the same techniques described above.
Each of these five subclones was immunoreactive with
a pool of anti-ET antisera noted in C. The subclones
contained short sequences from the "reverse" sequence
set forth previously. The beginning and ending points
of the sequences in the subclones (relative to the
full "reverse" sequence) are identified in the table
below.
TABLE 1

Subclone     Position in "Reverse" Sequence
             5'-end         3'-end
Y1           522            643
Y2           594            667
Y3           508            665
Y4           558            752
Y5           545            665



Since all of the gene sequences identified
in the table must contain the coding sequence for the
epitope, it is apparent that the coding sequence for
the epitope falls in the region between nucleotide 594
(5'-end) and 643 (3'-end). Genetic sequences
equivalent to and complementary to this relatively
short sequence are therefore particularly preferred
aspects of the present invention, as are peptides
produced using this coding region.
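
The region common to all five subclones is the overlap of their coordinate ranges: the largest 5'-end and the smallest 3'-end. A minimal Python sketch of this reasoning, using the Table 1 coordinates, is given below; the function name is illustrative and the calculation assumes, as stated above, that every subclone contains the complete epitope coding sequence.

```python
# Subclone coordinates in the "reverse" sequence, from Table 1.
subclones = {
    "Y1": (522, 643),
    "Y2": (594, 667),
    "Y3": (508, 665),
    "Y4": (558, 752),
    "Y5": (545, 665),
}

def common_region(intervals):
    """Return the (start, end) shared by all intervals, or None if disjoint."""
    intervals = list(intervals)
    start = max(five_prime for five_prime, _ in intervals)
    end = min(three_prime for _, three_prime in intervals)
    return (start, end) if start <= end else None

shared = common_region(subclones.values())
print(shared)  # (594, 643)
```

Here the 5' boundary is set by subclone Y2 and the 3' boundary by subclone Y1.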

A second series of clones identifying an 
altogether different epitope was isolated with only 
Mexican serum. 

TABLE 2

Subclone     Position in "Forward" Sequence
             5'-end         3'-end
ET 2-2       2              193
ET 8-3       [illegible]    135
ET 9-1       2              109
ET 13-1      [illegible]    101

The coding sequence for this epitope falls
between nucleotide 2 (5'-end) and 101 (3'-end).
Genetic sequences related to this short sequence are
therefore also preferred, as are peptides produced
using this coding region.

Two particularly preferred subclones for use
in preparing polypeptides containing epitopes specific
for HEV are the 406.3-2 and 406.4-2 clones whose
sequences are set forth above. These sequences were
isolated from an amplified cDNA library derived from a
Mexican stool. Using the techniques described in this
section, polypeptides expressed by these clones have
been tested for immunoreactivity against a number of
different human HEV-positive sera obtained from
sources around the world. As shown in Table 3 below, 8
sera were immunoreactive with the polypeptide expressed
by the 406.4-2 clone, and 6 sera were immunoreactive
with the polypeptide expressed by the 406.3-2 clone.

For comparison, the Table also shows 
reactivity of the various human sera with the Y2 clone 
identified in Table 1 above. Only one of the sera 
reacted with the polypeptide expressed by this clone. 
No immunoreactivity was seen for normal expression 
products of the gtll vector. 



Table 3

Immunoreactivity of HEV Recombinant
Proteins: Human Sera

Sera        Source      Stage(1)   Reactivity (406.3-2, 406.4-2, Y2, λgt11)
FVH-21      Burma       A          - [partly illegible]
FVH-8       Burma       A          [illegible]
SOM-19      Somalia     A          [illegible]
SOM-20      Somalia     A          [illegible]
IM-35       Borneo      A          [illegible]
IM-36       Borneo      A          [illegible]
PAK-1       Pakistan    A          [illegible]
FFI-4       Mexico      A          [illegible]
FFI-125     Mexico      A          - - - [partly illegible]
F 387 IC    Mexico      C          - - ND [partly illegible]
Normal      U.S.A.                 [illegible]

(1) A = acute; C = convalescent

While the invention has been described with
reference to particular embodiments, methods,
construction and use, it will be apparent to those
skilled in the art that various changes and
modifications can be made without departing from the
invention.